<?xml version="1.0" encoding="iso-8859-1"?>
<!-- http://www.slideml.org/specification/slideml_1.0/ -->
<s:slideset xmlns:s="http://www.oscom.org/2003/SlideML/1.0/"
xmlns="http://www.w3.org/1999/xhtml"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dct="http://purl.org/dc/terms/"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>
<s:metadata>
<s:title>Samizdat</s:title>
<s:subtitle>RDF model for an open publishing and cooperation engine</s:subtitle>
<s:author>
<s:givenname>Dmitry</s:givenname>
<s:familyname>Borodaenko</s:familyname>
<s:email>d.borodaenko@sam-solutions.net</s:email>
</s:author>
<s:confgroup>
<s:confdates>
<s:start>2003-05-28</s:start>
<s:end>2003-05-30</s:end>
</s:confdates>
<s:conftitle href="http://oscom.org/Conferences/Cambridge/">Third International OSCOM Conference</s:conftitle>
<s:address>Berkman Center for Internet and Society, Harvard Law School
</s:address>
</s:confgroup>
<dc:subject>Samizdat, Open Publishing, RDF, Squish, Ruby</dc:subject>
<dc:date>2003-05-30</dc:date>
<dc:rights>Dmitry Borodaenko</dc:rights>
<s:abstract>
<!--
Since you were supposed to come prepared, I will not
read this slide for you. Instead, today I will try to
explain the things about Samizdat that you probably
didn't understand. I also hope that after I've explained
that, you will have more questions than you had before.
The questions I want to cover first of all are: What is
Samizdat for? Why is it needed? What is novel in it?
Where else can it be used? Once we're done with that,
we will discuss other questions you may have.
-->
<p><a href="http://www.nongnu.org/samizdat/">Samizdat</a> is a
generic RDF-based engine for building collaboration and open
publishing web sites. Samizdat will let everyone publish, view,
comment, edit, and aggregate text and multimedia resources, vote on
ratings and classifications, filter resources by flexible sets of
criteria, cooperate and coordinate on all kinds of activities.</p>
<p>The talk gives perspective on open publishing systems, their
advantages and limitations, explains design goals of the Samizdat
project, and describes an RDF model that provides transparent and
extensible mapping of open publishing site resources into RDF
semantics.</p>
</s:abstract>
</s:metadata>
<s:slide>
<s:title>Open Publishing</s:title>
<s:content>
<!--
Ok, let's start with the easy part. What is all this open
publishing about? First of all, it is about trust.
-->
<p>An answer to corporate bias in mass-media</p>
<!--
You may think that the most important value of openness
is cooperation: after all, where would free software be
without people all around the world working on it all
together? Well, when it comes to publishing and media,
especially mass media, cooperation is not enough.
I think most of you agree that we already have enough
information, thank you very much. The problem with this
ocean of information is filtering out reliable
information that you can trust. And the problem with
traditional corporate mass-media is that they do not
represent their readers, they represent interests of
their owners and advertisers, and that is not the same
thing.
So, how can you trust that your information is not
biased by the interests of its source? That is what open
publishing is about.
-->
<p>Process of creating content is transparent to the readers</p>
<!--
The first principle of open publishing is transparency. To
believe something, you should be able to check where the
information comes from, who made which modifications to
it, and how other decisions about it were made.
-->
<p>Readers' contributions are immediately available</p>
<!--
Equally important is the principle of open
participation. Any reader should be able to submit
information and see that it is immediately available to
others. That way, you can be sure when you receive some
information that if it were false it could immediately
be refuted, or at least openly discussed from different
positions.
-->
<p>Readers can see and participate in editorial decisions</p>
<!--
Another key principle of open publishing, one that is not
fully implemented in existing open publishing software,
is open editing. As the experience of Independent Media
Centers shows, even open media cannot exist without
editing: it is relatively easy to flood an open site
with low-quality or downright off-topic content,
especially when you really intend to render it unusable.
The absence of open editing mechanisms forces editors to
moderate comments manually, thus undermining the whole
idea of openness with censorship.
-->
<p>Content can be freely redistributed</p>
<!--
The last one is free redistribution. Well, there are
several nice talks about exactly that on Track Two,
so it is not necessary to discuss this one here.
-->
</s:content>
</s:slide>
<s:slide>
<s:title>Engine: Active</s:title>
<s:content>
<!--
Ok, enough with theory, let's look at what software we
can use to solve the problem of open publishing.
-->
<p>First and most popular engine on IndyMedia.org network</p>
<!--
It all started with the Active engine used in the first
Independent Media Center in Seattle. When it first
appeared in 1999, it did its job, and it still does
on most sites of the IndyMedia network. Although it
wasn't the first engine to allow transparent and open
participation, it was the first engine to explicitly
focus on the idea of open publishing.
-->
<p>Implemented in PHP</p>
<p>Publish images and multimedia</p>
<!--
What was new and very important at the time was the
ability to publish images and multimedia files. Back then,
the ability to share live experience of protests with the
whole world played an important role in popularizing the
movement against corporate globalization.
-->
<p>Focus on simplicity</p>
<!--
Another reason for Active's popularity was its simplicity:
it was easy to set up and extremely easy to use.
-->
<p>No open moderation or open editing</p>
<!--
However, as the IndyMedia network grew, Active
started to show its weaknesses. Some of them were just
implementation problems, but some, such as the lack of open
editing, were fundamental enough to be impossible to
solve by patching the old code.
-->
</s:content>
</s:slide>
<s:slide>
<s:title>The need for a better open publishing engine</s:title>
<s:content>
<p>IMC tech meeting, January 2002</p>
<!--
The problems of Active didn't go unnoticed. Some IMC
sites started to adopt other web engines, such as Slash;
some started to develop their own, such as MirCode. In
January 2002, several IndyMedia developers gathered to
discuss their problems and produced a list of
requirements for a new open publishing engine.
-->
<p>Better design</p>
<!--
It's not hard to guess that the item that was stressed
the most was the design and documentation of a new engine:
after 3 years of development, or, rather, of patching in
new features here and there, the Active code had grown into
a sizeable ball of mud and become difficult to
understand, and thus unlikely to attract new
developers.
-->
<p>Internationalization</p>
<!--
Another problem with Active was poor support for
different languages: another kind of problem that
you don't usually pay attention to until it bites you,
especially when English is your mother tongue.
-->
<p>Distributed storage</p>
<!--
Now from smaller and more obvious problems to bigger and
more important ones, such as distributed storage.
The first and most important application of open publishing
is political speech, you know, that inconvenient kind of
speech that often gets suppressed. I think now you can
see the problem: it is very easy to shut down one
central site, and much more difficult to track down and
shut down all members of a peer-to-peer network. Thus,
to resist suppression, open publishing has to go P2P.
-->
<p>Content categorization</p>
<!--
Finally, here are the reasons I started the
Samizdat project: the need for content categorization
and open editing.
As I've said before, as an open publishing network
grows, the amount of available information also grows.
To manage this flow, readers should be able to match the
content against their areas of interest, and to narrow
it down to the most relevant and high-quality information.
-->
<p>Open editing</p>
<!--
Now that is where we come back to the question of trust:
all readers should be given equal power in making
editorial decisions. That is, to be able to categorize
open content properly, the editing process should be open as
well.
In the following slides, we will see how more advanced
open publishing engines fit these requirements. I
analyzed the main directions of development in that area and
selected what I considered the best engines of each kind.
If you've already looked at my slides, you may have
noticed that each engine that I present here is
implemented in yet another programming language. Believe
me, that is not intentional, but rather a coincidence that
makes some sense if you think about it.
-->
</s:content>
</s:slide>
<s:slide>
<s:title>Engine: MirCode</s:title>
<s:content>
<!--
See, Active was coded in PHP, and this one is in Java.
The Mir engine was developed by the IndyMedia Germany
collective, and it addressed some of the deficiencies of
Active. It doesn't attack the more fundamental problems of
open publishing; instead it does the ordinary thing, but
does it really well, with all the bells and whistles.
-->
<p>Implemented in Java</p>
<p>Internationalization</p>
<!--
First of all, internationalization in Mir finally
made it convenient to publish content in different
languages and to localize the site interface.
-->
<p>Static publishing</p>
<!--
Although Mir doesn't provide P2P publishing, it took one
step in that direction: static publishing. Mir produces
static HTML pages that can be easily mirrored and cached.
-->
<p>Media abstraction layer</p>
<!--
Mir also generalized Active's multimedia content
publishing into a media abstraction layer that makes it
possible to add and handle different media formats.
-->
<p>Static categorization</p>
<!--
Another good thing about Mir is that it lets you
categorize content into topics, features, newswires, and
so on.
-->
<p>Dublin Core metadata</p>
<!--
And on top of it, it uses Dublin Core metadata for all
of this.
-->
</s:content>
</s:slide>
<s:slide>
<s:title>Engine: Scoop</s:title>
<s:content>
<p>Powers Rusty's Kuro5hin.org</p>
<!--
Another engine I'd like to draw your attention to is
Scoop, which was originally developed for
Kuro5hin.org. It doesn't have all the bells and whistles
of MirCode, it doesn't even call itself an open
publishing engine, but it has something that MirCode so
badly lacks: open moderation.
-->
<p>Implemented in Perl</p>
<p>Focus on discussions</p>
<!--
Since Kuro5hin's original purpose was to create an online
community, the engine was focused on maintaining
high-quality discussions rather than on feeding
up-to-date news.
-->
<p>Excellent open moderation</p>
<!--
The idea behind Kuro5hin's open moderation is simple: site
users are allowed to select ratings for the comments
they like or dislike; comments with a higher average
rating appear at the top and receive more attention from
readers. Simple as it is, this system produces very good
results, especially with large user communities.
-->
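<!--
The rating scheme just described can be sketched in a few
lines of Python. This is only an illustration, not Scoop's
actual code: the comment ids, the 1 to 5 scale, and the use
of a plain average are assumptions made for the sketch.

```python
from statistics import mean

# Hypothetical data: comment id -> ratings given by site users.
ratings = {
    "comment-a": [5, 4, 4],
    "comment-b": [1, 2],
    "comment-c": [3, 5, 5, 5],
}

def rank_comments(ratings):
    """Return comment ids ordered by average rating, highest first."""
    return sorted(ratings, key=lambda cid: mean(ratings[cid]), reverse=True)

ranked = rank_comments(ratings)
```

The more users rate a comment, the less any single rating
moves its average, which fits the observation that the
scheme works best with large communities.
-->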
</s:content>
</s:slide>
<s:slide>
<s:title>Engine: Active2</s:title>
<s:content>
<!--
Now that I've described what I deem the most advanced of
the existing IndyMedia engines and the most effective open
moderation system to date, the third engine I'd like to
mention is Active2, the project that was started to
satisfy the requirements set forth by the IMC tech
meeting in 2002.
-->
<p>Young and ambitious project</p>
<!--
It is a relatively young and very ambitious project that
hasn't yet reached its alpha release. Instead of
smoothing Active's rough edges, it attacks more
fundamental problems: distributed peer-to-peer storage
and open editing.
-->
<p>Implemented in Python</p>
<p>Heavy usage of frameworks (Crusader, Cheetah)</p>
<p>Distributed P2P sharing</p>
<!--
It is probably too early to say what will come out in
the end, but right now the main focus of the project
seems to be P2P.
-->
<p>Dynamic RDF metadata</p>
<!--
They are also paying attention to the content
categorization problem: they support dynamic RDF
metadata based on the Dublin Core.
-->
<p>Signed content</p>
<!--
All content in Active2 is supposed to be signed: this is
required both by P2P and by open editing considerations.
-->
</s:content>
</s:slide>
<s:slide>
<s:title>Engine: Samizdat</s:title>
<s:content>
<!--
Now that we've seen what is out there, you may have
already guessed why I decided to start a new project
instead of trying to enhance one of these. No ideas?
-->
<p>Implemented in Ruby</p>
<!--
Right! The reason is that none of the engines I've tried
was written in my language of choice, Ruby ;-)
In addition, each of them solves its own problem at the
expense of everything else: MirCode successfully tries
to be a better Active than Active, but fails to become
something new; Scoop doesn't even try to apply its
excellent moderation to the area of open publishing;
the Active2 project doesn't pay enough attention to the open
editing part of the story, and it has managed to produce a
lot of code and even integrate some of those F-things
(frameworks) with almost non-existent design
documentation, so it has a good chance of becoming
unmanageable before it gets finished.
-->
<p>Focus on clean abstract design</p>
<!--
And this last one is why I put the main focus on design
and documentation. Without documenting what is done and
what is to be done, you are acting blindly: you know
neither where you are nor where you are going.
-->
<p>RDF model for site structure and metadata</p>
<!--
I've been following Semantic Web development for the
last two years, so by the time I started the Samizdat
project, it was obvious to me that RDF provides an
excellent basis for an open editing solution.
Open editing is about readers making statements about
resources, and making statements about resources is what
RDF is for. RDF allows anyone to make statements about
anything, without having to maintain referential
integrity.
-->
<p>Open editing via statement reification</p>
<!--
What is even better, RDF makes it possible to make
statements about statements. That means readers are not
only allowed to make editing decisions, such as
categorizing content, but also to state whether they agree
or disagree with editorial decisions made by others.
-->
<p>Relational RDF storage</p>
<!--
Once it was decided to build the Samizdat model around RDF,
it was only reasonable to store all the site data as
RDF, and to access it with RDF queries: this way, one
more layer of complexity is gone.
Well, not completely gone, because I had to implement my
own RDF storage layer for performance and flexibility
reasons, but at least this layer is cleanly
separated from the rest of Samizdat.
-->
<p>More than publishing: material items exchange</p>
<!--
If you start thinking about requirements for a site that
supports some community, you will soon come up with many
ideas beyond mere publishing: a community exists not only
to exchange information, it also exists to cooperate on
common projects, and to share resources. The design of
the Material Items Exchange section of Samizdat shows
how it can be extended with such collaboration modules.
-->
</s:content>
</s:slide>
<s:slide>
<s:title>Samizdat design goals</s:title>
<s:content>
<p>Publish: open, multimedia, editing, aggregation, trust</p>
<p>Vote: visibility, content organization</p>
<p>Search and filter: quality, category, relation</p>
<p>Cooperate: calendar, material item exchange, time management</p>
<p>View: internationalization, accessibility, email interface</p>
<p>Develop: modular architecture, documentation, security</p>
</s:content>
</s:slide>
<s:slide>
<s:title>Samizdat concepts</s:title>
<s:content>
<p>Member</p>
<p>Message</p>
<p>Tag</p>
<p>Proposition and Vote</p>
<p>Aggregation</p>
<p>Item and Possession</p>
</s:content>
</s:slide>
<s:slide>
<s:title>Concept: Member</s:title>
<s:content>
<p>Add messages, tags, propositions and votes</p>
<p>View messages, use and publish filters</p>
<p>Associate with friends</p>
<p>Pool material items</p>
</s:content>
</s:slide>
<s:slide>
<s:title>Concept: Message</s:title>
<s:content>
<p>Basic unit of information</p>
<p>Subject of most metadata</p>
<p>Multimedia</p>
<p>Threaded</p>
</s:content>
</s:slide>
<s:slide>
<s:title>Concept: Tag</s:title>
<s:content>
<p>Content structure metadata glue</p>
<p>RDF statement (s::tag resource-uri tag-uri)</p>
<p>Tag is resource, any resource is a valid tag</p>
<p>Standard tags: Quality, Priority, Relevance</p>
<p>Custom tags: Fairness, Representation, Factuality,
Novelty, FrontPage, whatever...</p>
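<!--
A tag is just one more RDF statement. Here is a minimal
Python sketch, assuming a statement is a plain tuple in the
same (predicate subject object) order as the pattern on this
slide. The URIs and helper names are made up for illustration
and are not Samizdat's own API.

```python
from collections import namedtuple

# Same order as the pattern on the slide: (predicate subject object).
Statement = namedtuple("Statement", ["predicate", "subject", "object"])

TAG = "s::tag"  # predicate name taken from the slide

def tag(resource_uri, tag_uri):
    """Build the statement saying that resource_uri carries tag_uri."""
    return Statement(TAG, resource_uri, tag_uri)

def resources_tagged(statements, tag_uri):
    """List the resources that carry the given tag."""
    return [s.subject for s in statements
            if s.predicate == TAG and s.object == tag_uri]

# Hypothetical URIs. The last statement uses a message as a tag,
# since any resource is a valid tag.
statements = [
    tag("msg/1", "tag/Quality"),
    tag("msg/2", "tag/FrontPage"),
    tag("msg/3", "tag/Quality"),
    tag("msg/3", "msg/1"),
]
quality = resources_tagged(statements, "tag/Quality")
```

Because the tag slot holds an ordinary resource, no separate
tag vocabulary has to be declared before members start
tagging.
-->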
</s:content>
</s:slide>
<s:slide>
<s:title>Concept: Proposition and Vote</s:title>
<s:content>
<p>RDF statement that can be approved with votes</p>
<p>Tag rating, content clustering</p>
<p>Reification allows for meta-moderation</p>
<p>Pluggable rating and threshold calculation</p>
<p>Accountable and recallable</p>
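<!--
A sketch of how votes on a reified statement might work, in
Python. Everything here is an assumption made for
illustration: the class names, the +1 and -1 vote values, the
average as the default rating function, and the threshold of
0.5. None of it is taken from the Samizdat code.

```python
from dataclasses import dataclass, field

@dataclass
class Vote:
    member: str  # every vote names its author, so it is accountable
    value: int   # assumed scale: +1 to approve, -1 to disapprove

@dataclass
class Proposition:
    """A reified statement: a triple that votes can point at."""
    predicate: str
    subject: object  # a URI string or another Proposition
    obj: str
    votes: list = field(default_factory=list)

def average(values):
    return sum(values) / len(values) if values else 0.0

def rating(prop, calc=average):
    """Pluggable rating: any function over the vote values will do."""
    return calc([v.value for v in prop.votes])

def approved(prop, threshold=0.5, calc=average):
    return rating(prop, calc) >= threshold

def recall(prop, member):
    """Votes are recallable: drop every vote cast by this member."""
    prop.votes = [v for v in prop.votes if v.member != member]

p = Proposition("s::tag", "msg/1", "tag/Quality")
p.votes += [Vote("alice", 1), Vote("bob", 1), Vote("carol", -1)]

# Meta-moderation: a proposition whose subject is another proposition.
meta = Proposition("s::tag", p, "tag/Fairness")
meta.votes.append(Vote("dave", 1))
```

Passing a different calc function or threshold changes the
approval rule without touching the stored votes.
-->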
</s:content>
</s:slide>
<s:slide>
<s:title>Concept: Aggregation</s:title>
<s:content>
<p>nextVersionOf: Correction, Rewrite, Summary, Translation,
Mirror</p>
<p>parts, partOf</p>
</s:content>
</s:slide>
<s:slide>
<s:title>Concept: Material Item Exchange</s:title>
<s:content>
<p>Item: description and instance</p>
<p>Possession: givenTo, possessor</p>
<p>Service items with no possession records</p>
</s:content>
</s:slide>
<s:slide>
<s:title>RDF storage</s:title>
<s:content>
<p>Extended Squish query language</p>
<p>Graph-to-relational translation layer</p>
<p>PostgreSQL RDBMS</p>
<p>Resource (id, published_date, literal, uriref, label)</p>
<p>Statement (id, subject, predicate, object)</p>
<p>Resource tables: Member, Message, Proposition, Vote, Item,
Possession, whatever...</p>
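<!--
To make the relational layout concrete, here is a small
Python sketch that uses the standard sqlite3 module in place
of PostgreSQL. The table and column names come from this
slide. The column types, the reading of literal and uriref as
flags, the idea that subject, predicate and object hold
Resource ids, and the sample data are all assumptions made
for the sketch. It shows how one Squish-style pattern,
(s::tag ?msg ?tag), could translate into SQL joins.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Column names follow the slide; types and flag semantics are assumed.
db.executescript("""
CREATE TABLE Resource (
    id INTEGER PRIMARY KEY,
    published_date TEXT,
    literal INTEGER,
    uriref INTEGER,
    label TEXT
);
CREATE TABLE Statement (
    id INTEGER PRIMARY KEY,
    subject INTEGER REFERENCES Resource(id),
    predicate INTEGER REFERENCES Resource(id),
    object INTEGER REFERENCES Resource(id)
);
""")

def resource(label):
    """Insert a URI reference resource and return its id."""
    cur = db.execute(
        "INSERT INTO Resource (literal, uriref, label) VALUES (0, 1, ?)",
        (label,))
    return cur.lastrowid

def statement(subject, predicate, obj):
    """Store one triple as a row of Resource ids."""
    db.execute(
        "INSERT INTO Statement (subject, predicate, object) VALUES (?, ?, ?)",
        (subject, predicate, obj))

# Hypothetical sample data.
tag_pred = resource("s::tag")
quality = resource("tag/Quality")
msg1, msg2 = resource("msg/1"), resource("msg/2")
statement(msg1, tag_pred, quality)

# The pattern (s::tag ?msg ?tag) expressed as joins over the two tables.
rows = db.execute("""
    SELECT s.label, o.label
    FROM Statement st
    JOIN Resource p ON p.id = st.predicate
    JOIN Resource s ON s.id = st.subject
    JOIN Resource o ON o.id = st.object
    WHERE p.label = ?
""", ("s::tag",)).fetchall()
```

Each variable in a pattern becomes one more join against
Resource, which is roughly the job of a graph-to-relational
translation layer.
-->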
</s:content>
</s:slide>
<s:slide>
<s:title>Thank you</s:title>
<s:content>
<p>Questions?</p>
</s:content>
</s:slide>
</s:slideset>