Atom and RSS feeds are typically used to support syndication of
existing works, most commonly weblog entries. They are XML documents
that provide a common representation that can be consumed by feed
readers, unlike the HTML pages for such a work. ActivityStreams is a
format for syndicating social activities around the web.
Based on the Atom Syndication Format, it tries to provide a feed for
activities, rather than existing works. This includes the
act of posting a blog entry, but can also express
activities typical for social networking sites, like adding friends,
liking something, or affirming an RSVP for an event.
At Mediamatic Lab, we recently gave notifications an overhaul. We had some code scattered around for sending notices to users, for example when they received a friend request. We wanted to add a bunch of notifications so that people are aware of what happens in the network of sites, with their profile, or with works they've created. For example, when someone tags a person as being in a picture, it would be nice for that person to get a message about that. We also have a collection of RFID-enabled Interactive Installations that generate XMPP notifications for our backchannel system. I'll come back to this.
Whenever something happens that you want to send a notification for, there are a couple of things that you want to include in the notification: what happened, when it happened, and which persons and/or things are involved. The concepts of ActivityStreams turned out to coincide with how we wanted the notification to work. It abstracts activities in actors, objects and targets, along with a human-readable text to describe each activity.
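These concepts map naturally onto a small data structure. A minimal sketch in Python (hypothetical names, not anyMeta's actual code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ActivityObject:
    """An actor, object or target: a URI, a title and an object type."""
    uri: str
    title: str
    object_type: str

@dataclass
class Activity:
    """What happened, when it happened, and who/what was involved."""
    verb: str                      # a URI identifying the kind of activity
    actor: ActivityObject
    obj: ActivityObject
    published: str                 # when the activity took place
    target: Optional[ActivityObject] = None
    title: str = ""                # human-readable description
```

The `target` is optional because many activities, like simply posting an entry, have no meaningful target.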
A verb is an identifier for the kind of activity that has taken
place. A verb takes the form of a URI, much like rel
attributes in Atom link elements, or properties in RDF. The most basic
verb is post.
An actor is the entity, usually a person, that has performed the activity. Objects are the main persons or things that the activity was performed upon. For example, when I post a picture, I am the actor, and the picture is the object. A target is the object that an activity was done to. An example could be the photo album my uploaded picture was posted to. Actors, objects and targets usually have a URI, a title and an object type, similar to RDF classes.
Our new notification system does a couple of things whenever an activity has taken place. It figures out the verb, actor, object and possibly the target, and then creates a notification entry. It then calculates the parties that should get this notification in their inbox. This is usually the actor and the owner of the object. A person's profile is always owned by that person, so when the object is a person, that person gets a notification of things happening to them. When a party is not local (i.e. on another site in the federation), the notification is sent to the other site to be processed there. Each person's inbox is viewable as a stream of activities, much like Jaiku or Facebook, and is also published as an ActivityStreams feed (e.g. ralphm's activities). New notifications can then be processed by other modules.
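The routing step can be sketched like this (hypothetical function and parameter names, not the actual anyMeta module):

```python
def notification_recipients(actor, obj, owner_of, is_local):
    """Compute who gets an activity in their inbox: the actor and the
    owner of the object. A person's profile is owned by that person,
    so activities on a person reach them directly. Recipients on
    another site in the federation are returned separately, to be
    forwarded there for processing."""
    recipients = {actor, owner_of(obj)}
    local = {r for r in recipients if is_local(r)}
    remote = recipients - local  # forwarded to the federated site
    return local, remote
```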
One of them is the Message module, which sends out e-mails for notifications according to personal preferences. For now, you can choose which kinds of notifications you want to receive an e-mail about by choosing the verbs you are interested in. Examples currently include: friend requests/confirmations, changes to things you own, people liking, linking to, RSVPing (for events), sharing (to Twitter, Facebook, etc.) or commenting on things you own, and people tagging you in a picture.
Another module is the Publish-Subscribe module, which provides XMPP Publish-Subscribe nodes for each person, along with a node for all notifications for that site. This allows for applications that use the stream of activities for a person, or the whole site, in near-real-time. An example could be a mobile app to track activity for you and/or your friends, or IM notifications much like Identi.ca or Jaiku.
Another possibility is a backchannel. We developed a backchannel system for events where we deploy our RFID-enabled Interactive Installations. A backchannel feed is aggregated from a configurable set of sources, and the incoming items are formatted into notifications to be put up on a live stream. Every time someone takes a (group) picture with our ikCam, the image is posted on the backchannel, along with a text listing the people in the picture.
On top of that, we can also include tweets by tracking particular keywords and people. We use (and improved) twitty-twister to interact with Twitter's Streaming API from Twisted. I recently changed the streaming code of twitty-twister to consume JSON instead of the deprecated XML (with a bunch of tests), and added a way to detect unresponsive connections.
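Detecting an unresponsive streaming connection boils down to noticing that no data at all (not even a keep-alive newline) has arrived for a while. A sketch of the idea, independent of twitty-twister's actual code:

```python
import time

class StallDetector:
    """Sketch of detecting a stalled streaming connection: record the
    time of the last received data; if nothing arrives for `timeout`
    seconds, the connection is presumed dead and should be torn down
    and re-established."""
    def __init__(self, timeout=90):
        self.timeout = timeout
        self.last_seen = time.monotonic()

    def data_received(self, data):
        # Any traffic, including keep-alive newlines, resets the clock.
        self.last_seen = time.monotonic()

    def is_stalled(self, now=None):
        now = time.monotonic() if now is None else now
        return now - self.last_seen > self.timeout
```

In a Twisted setting the check would be driven by a repeating timed call on the reactor rather than by polling.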
With activities now also available as XMPP notifications, the logical step was to consume these for the backchannel as well. We have an office backchannel on a big screen that tracks Twitter for keywords related to Mediamatic and its events, and the notifications from our interactive installations. It now also includes activity on our sites, and this turns out to be a great way to see everything happening in our sites.
So, did it all go smoothly? No. We ran into quite a few things in the combination of ActivityStreams' concepts, anyMeta, and our interactive installations that we didn't expect when we started the project.
One of the big ones was Agents. Our interactive installations have
their own user accounts to take pictures, process votes, etc. These
accounts have special privileges to perform actions like making all
people in an ikCam picture contacts in the network. We also have a
Physical I-like-it button, which is an RFID reader placed next to a
physical object (e.g. a painting) that has a representation in one of
the sites. When reading a tag, it creates a like
relationship between the holder of the RFID tag and the object. When we
first enabled the new notifications functionality, a
message popped up on the backchannel:
ikPoll Agent likes iTea
That was quite unexpected, but quite logical when we thought about it. ikPoll Agent is the user account for the I-like-it button, which is powered by the same software as our more generic ikPoll installations. We defaulted the actor of an activity to the user account performing the action. But although the agent creates the link from the person to the object, it does so on behalf of the physical user, who should be the actor. So we needed to introduce the concept of Agents, and have that also stored and communicated along with activities. The same action would now yield an entry with the title 'ralphm likes iTea (via ikPoll Agent)'.
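The resulting titles boil down to a small formatting rule. A sketch (hypothetical helper name):

```python
def activity_title(actor, verb, obj, agent=None):
    """Render 'actor verb object', appending '(via agent)' when the
    user account performing the action acted on behalf of a physical
    user, as with our RFID-driven installations."""
    title = f"{actor} {verb} {obj}"
    if agent is not None:
        title += f" (via {agent})"
    return title
```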
Another was pictures taken with the ikCam. Besides posting the
image, all actors are tagged in the picture, and the picture is optionally
linked to an event and a location. This yields a bunch of
notifications where we would like to have only one:
a self-portrait. We have started work on compound activities
that would have the enclosed activities linked to it and back, a bit
like the Atom
Threading Extension. This would allow aggregators like the
backchannel to show only the umbrella notification.
A final one was our verb
link. This was supposed
to be a catch-all verb for the activity of creating a semantic link
between two things, of which the predicate didn't already have its own
verb (like friending, liking, etc.). It now looks like having a
notification like 'person A linked to thing B' might need some more
information. An e-mail notification at least has the links to
respective pages, but that doesn't quite work on a backchannel beamed
on a big screen. For now we ignore such notifications for the
backchannel, until we have a better solution. It might be that we need
to include the link's predicate in the notification, or make links
themselves first-class citizens (with their own URI).
Going to FOSDEM and/or the 10th XMPP Summit in Brussels? I'll be talking about this and other topics in my talk on Federating Social Networks on Saturday 5 February.
Last week, Blaine Cook congratulated me on Idavoll being in Apple Mac OS 10.6 Server, as its Notification Server. I did have contact with Apple's server team ages ago about them using Idavoll and adding some customizations, but never knew where it ended up. The list of Open Source projects used in Apple's products confirms the use of Idavoll, and of Wokkel, too, as a dependency of Idavoll. Cool!
Idavoll, and thus Notification Server, is a generic XMPP publish-subscribe service, written in Python with Twisted. Upon inspecting the code and the differences against the mentioned versions, most of the customizations match those I was already aware of: an SQLite backend, the whitelist node access model, and associated member affiliations. The link to Notification Server in the open source list goes nowhere (yet), so I am unsure about the actual license of their additions. I have contacted the server team, and will write again when I have more news on this.
At the nice post by Jack Moffitt on Apple's use of XMPP, Kael mentions the presence of more Publish-Subscribe goodness in Calendar Server. This is actually the part that uses Notification Server for push notifications in iCal. As Jack says, it is truly great to see large corporations like Apple embrace XMPP like this. I really wish Google Calendar had a similar feature; now I only get meeting invites through e-mail. Apple's particular use of Publish-Subscribe reminds me of Joe Hildebrand's effort on WebDAV notifications, and I think there are a lot of applications that could benefit from such push features.
As I touched upon earlier, at Mediamatic Lab we use XMPP Publish-Subscribe for exchanging things for federation. But we've also built a bunch of interactive installations, most of them dealing with RFID tags we call ikTags. To name two examples: the ikCam takes a (group) picture, uploads it, and friends the depicted persons by reading their tags; the ikPoll is a polling station where people can 'vote' on questions with their tag. Typically, there are also publish-subscribe notifications coming out of those interactions, so you can create a live stream of things happening at an event like PICNIC. Combined with the Twitter Streaming API and our own status messages, this creates an entertaining backchannel, coincidentally powered by Idavoll.
Two exciting projects I've recently been working on at Mediamatic Lab are two highly connected sites around the Jewish Community in the Netherlands during World War Ⅱ. The first is one of the oldest sites we have made, the Digital Monument. This site contains verified information on all of the Dutch Jews who died during WWⅡ, along with their households, documented possessions, and known documents and pictures. It is maintained by a team of editors of the Jewish Historical Museum.
The second is a brand new community site, complementing the Monument by allowing anyone to add new information, pictures and stories about the people at the Monument.
The Monument is very impressive, as I learned back at the first BarCamp Amsterdam, hosted by Mediamatic. You will know what I mean if you spend as little as five minutes paging through the site. Today, however, I want to talk about the technology behind both sites.
The data in the Monument is highly semantic in nature. People are part of households, as head-of-family, spouse, son or daughter, or some other relation. Households have a location and lists of possessions. Tied to all of these are supporting documents and pictures. In anyMeta, all of these are modelled as things with edges between them, each edge having a certain predicate. A typical household would be modelled like this:
For the community site, however, we wanted to have more direct relationships between people: parent-child relations, sibling relations, partner relations and a more generic (extended) family relationship. As the community also has most things of the monument imported, this meant a change in the data model and a subsequent conversion in the monument.
In anyMeta, (almost) everything is a thing. As such, the predicate on an edge between two things is also represented by a thing. This has traditionally been named a role. As all things in an anyMeta site have a resource URI, the resource URI of a role is the predicate's URI. We try to use existing (RDF) vocabularies for this as much as possible.
For relationships between people, we've used the RELATIONSHIP vocabulary
used with FOAF. So this was the first place to look for the desired new
predicates. However, this vocabulary does not have a property for
expressing a generic extended family relationship. Fortunately,
XFN has the
kin relationship type. Additionally,
Richard Cyganiak described how to express XFN relations in
RDF, so we used that as the basis for our predicates.
Like RELATIONSHIP, most of the XFN properties are subproperties of the
foaf:knows property, and have some
hierarchy themselves, too. In anyMeta, we didn't yet have the concept of
subproperties, so we added a new role for expressing subproperty
relationships between roles, and introduced the concept of
implicit edges. These are edges with a
superpredicate of the explicit edge that is being created. For
example, the xfn:child property is a subproperty of
foaf:knows. Whenever an edge between two people
gets created with the child role, another, implicit one with the knows
role is added, too.
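Deriving implicit edges from such a subproperty hierarchy can be sketched like this (a simplified model; in anyMeta the roles are themselves things, not strings):

```python
def implicit_edges(subject, obj, role, superproperty):
    """Given an explicit edge (subject, role, obj) and a mapping from
    each role to its superproperty, return the implicit edges that
    follow from walking up the property hierarchy."""
    edges = []
    while role in superproperty:
        role = superproperty[role]
        edges.append((subject, role, obj))
    return edges
```

Walking the hierarchy to the top means a multi-level hierarchy yields one implicit edge per ancestor property.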
After conversion and with the implicit edges present, the new data model of the example above looks like this:
The blue arrows are the new, derived edges. A spouse edge is made between those people that respectively have a head-of-family and partner relation to the same household (this can be assumed to be correct for this dataset). For persons that have a son or daughter edge to a household, a child edge is made from the head-of-family and partner persons (if any) in that household to this person. We haven't (yet) added derived sibling edges, as this relation also depends on the parents of both persons.
You can also see gray, dashed edges. These are the implicit edges that follow from the property hierarchy. Another thing to notice is that the biographies are gone. We put those texts directly on the persons and households instead.
Besides the regular pages of all people, households and other things, you can also use our semantic browser to look at the relationships between things. For example, Mozes and his family can be browsed from here.
Even before I got to work for Mediamatic Lab, Mediamatic was using Twisted. My friend Andy Smith used it for a bunch of projects around physical objects, usually involving some kind of RF tags. Examples include the Symbolic Table and the Friend Drinking Station. From this grew fizzjik, a Twisted based library that implements support for several kinds of RFID readers, network monitoring and access to online services like Flickr and of course anyMeta.
On the other hand, I have dabbled in Twisted for quite a while now, mostly contributing XMPP support in Twisted Words and through the playground that is known as Wokkel. But why go through all that effort, while there are several different Python-based XMPP implementations out there? And why does Mediamatic use Twisted? Why do I believe Twisted is awesome?
First of all, we like Python. It is a great little language with
extensive library support (batteries included), where
everything is an object, much like in anyMeta. It is a language for
learning to program and for coding small utility scripts, but also for entire
applications.
But going beyond that, building applications that interact with different network protocols and many connections all at the same time is a different story. Many approach such a challenge by using preemptive threading. Threads are hard. Really hard. And Python has the GIL, allowing the interpreter to execute byte codes in only one thread at a time.
So in comes Twisted. Twisted is a framework for building networked applications in Python, through a concept known as cooperative multitasking. It uses an event loop that hands off processing of events (like incoming data on a socket or a timer going off) to non-blocking functions. Event loops are mostly known from GUI toolkits like GTK, and Twisted goes even beyond networking by working with such toolkits' event loops, too. As most network protocol implementations only have a synchronous interface (i.e. one that blocks), Twisted includes asynchronous implementations of a long list of network protocols. For the blocking interfaces that come from C libraries, like databases, Twisted provides a way to work with their threads, while keeping all your controlling code in the main thread. Asynchronous programming does take some getting used to, hence Twisted's name.
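Stripped of all of Twisted's actual machinery, the core idea of such an event loop can be sketched in a few lines of plain Python (a toy illustration, not Twisted's reactor):

```python
import heapq
import time

class ToyReactor:
    """A minimal event loop: callbacks are scheduled and then run one
    at a time, in time order. Since no callback may block, there is no
    need for threads or the locking that comes with them."""
    def __init__(self):
        self._calls = []
        self._seq = 0  # tie-breaker for calls scheduled at the same time

    def call_later(self, delay, func, *args):
        heapq.heappush(self._calls,
                       (time.monotonic() + delay, self._seq, func, args))
        self._seq += 1

    def run(self):
        while self._calls:
            when, _, func, args = heapq.heappop(self._calls)
            time.sleep(max(0.0, when - time.monotonic()))
            func(*args)  # hand off the event to a non-blocking function
```

A real reactor also multiplexes over sockets, devices and GUI events, but the single-threaded hand-off to callbacks is the same.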
So how do we use Twisted? A recent application is our RFID polling system. It allows people to use their ikTag (or any card or other object with a Mifare tag), tied to their user account on an anyMeta site, to take part in a poll by having their tag read at an RFID reader corresponding to a possible answer. The implementation involves:
Communication with one or more RFID readers (that also have output capabilities for hooking up lamps, for example).
Communication over HTTP to access anyMeta's API to store and process votes.
Network availability monitoring (can we access the network and specifically our anyMeta site?)
Power management monitoring (do we still have power?)
Device monitoring. Are RFID readers plugged in or not? Which device handle are they tied to? This also watches devices coming and going.
A (GTK-based) diagnostic GUI
Additionally, we also want to show polling results, so we have a browser talking to a local HTTP server and a listener for XMPP publish-subscribe notifications.
This is quite a list of tasks for something as seemingly simple as a polling station. But wait: there can be multiple readers tied to a particular poll answer, likely physically apart; a polling question can have maybe 50 answers (depending on the type of poll, like choosing from a collection of keywords); or there could be a lot of questions at one event.
So, back to Twisted. Twisted has HTTP and XMPP protocol support (both client- and server-side), can talk to serial devices (like your Arduino board) and DBus (for watching NetworkManager and device events), and provides event loop integration with GTK to also process GUI events and manipulate widgets based on events. Together with Wokkel, it powers the exchange of information in our (and your?) federating social networking sites. In Python. No threads and associated locking. In ridiculously small amounts of code. That's why.
Not yet convinced? Add a Manhole to your application server, SSH into it, and get an interactive, syntax highlighted Python prompt with live objects from your application. Yes, really.
I am attending XMPP Summit #7 and part of OSCON 2009, with which it is co-located thanks to the kind folks at O'Reilly. Much like last year, only this time in San José, California. Unlike the European version of the summit last February, we hope to focus more on doing than talking, although there will be plenty of that too, of course.
Suggestions were made to do some interoperability testing, along with general hacking sessions. I am bringing my implementation of server-to-server dialback, and a bunch of other protocol implementations in Wokkel, to the table. While there are a bunch of other protocol implementations in Python, I think the Twisted approach is so different that I want people to know about the ideas behind it. Introducing them to Twisted through Wokkel should give them at least a glimpse of why I believe Twisted is awesome.
So, in the run-up to the summit, I prepared a bunch of examples around the XMPP Ping protocol, as I mentioned before. Additionally, I prepared an example echo bot on steroids, which is basically a stand-alone XMPP server that connects to other servers using the server-to-server protocol. It will accept presence subscriptions for any potential account at the configured domain, sending presence and echoing all incoming messages.
Besides the hacking sessions, I am planning to discuss publish-subscribe delete-with-redirect, node collections, publish-subscribe in multi-user chats, and service discovery meta data. Oh, and we might go on a field trip to discuss Google Wave's XMPP-based federation protocols. Then, after the summit, I will be hanging out at OSCON until Thursday, for hallway meet-ups on federating social networks with protocols like OpenID and OAuth and technologies like webfinger and PubSubHubbub. I also brought an RFID reader to play with.
Today's Wokkel 0.6.2 release is to showcase some of the features in the previous 0.6.0 release. Most of the work was part of the things we have been building at Mediamatic Lab as part of a restructuring of how we federate our social networking sites using publish-subscribe.
First of all, I added a preliminary, but functional, implementation of server-to-server support, using the dialback protocol. This complements the router code that went into 0.5.0 and Twisted Words 8.2.0 to make a fully stand-alone XMPP server. Note that it does not implement any client-to-server functionality yet, but this can be added as separate server-side components now.
To show this off, I have created a bunch of examples around the XMPP Ping protocol, for which the protocol implementation itself is also a nice example of how to write XMPP protocol implementations using Twisted Words and Wokkel. Be sure to check out these examples.
The other feature I want to mention is publish-subscribe Resources. They provide an abstraction of (part of) a publish-subscribe service. The protocol parts are handled by Wokkel. This should make it easier to do node-as-code scenarios, by just filling in the blanks of the various methods that are called upon receiving requests from pubsub clients. I'll create some examples for this shortly.
PubSubHubbub is a protocol and reference implementation for doing publish-subscribe using web hooks: feed polling is triggered by a ping from the publisher, and Atom entries are POSTed to notify subscribers. The notification part is similar to what I've been working on for the publish-subscribe stuff at Mediamatic Lab, where we spiced up Idavoll with an HTTP interface to bridge the gap between XMPP Publish-Subscribe and HTTP-speaking entities.
Although I spend a lot of time working on XMPP-based publish-subscribe, I understand the reasons for going for a fully HTTP-based approach. XMPP can be intimidating for developers of web applications. While the differences between XMPP and HTTP are important (stateful connections, asynchronous processing, etc.), the fact that it is different is often reason enough. Hosting facilities don't always offer ways to do XMPP, and there is not nearly enough running code out there to make it easy for people to play with these technologies and spice up their web applications with non-IM XMPP functionality. Having platforms like Google App Engine provide sending and handling of raw XMPP stanzas as part of the API would surely help.
That said, PubSubHubbub has two separate sides to it: the publishing part and the notification part. There's nothing that prevents a hub from doing the publishing part using regular XMPP publish-subscribe. Instead of fetching the Atom feed over HTTP every time, it could use autodiscovery to find the publish-subscribe node and upgrade by subscribing to it instead. Similarly, the notification part could send out XMPP notifications. Combined with an existing HTTP aggregator, that combination is very similar to how the aggregator for Mimír works.
I'm still not convinced that PubSubHubbub is the answer to the efficient exchange of updates on social objects, but I do think it is a good way to let smaller entities be part of a federation of social networking sites. Likely, we'll see a hybrid approach to begin with.
Last month I was fortunate enough to attend Social Web FooCamp at O'Reilly HQ in Sebastopol, CA, a follow-up to Social Graph FooCamp in 2008. I can't express how inspiring such events are: being able to have continuous, in-depth conversations with so many bright minds about so many topics that keep you busy on regular days, and more. I'll give a quick overview of the whole trip, and then go into depth in a series of posts.
My trip started with a visit to friend and former Jaiku colleague Andy Smith, who was kind enough to take me in at Houseku. As soon as I landed at SFO, I got an SMS from him asking me to make a detour to his office. Besides meeting a bunch of Andy's fellow Googlers, I got to spend some time with Brett Slatkin talking about PubSubHubbub.
The next day I got a ride to Sebastopol from Edwin Aoki. After a trip full of interesting conversation, we arrived at the O'Reilly offices. Sebastopol was a lot warmer than San Francisco, perfect for camping. Lots of familiar faces, but also a lot of new ones. During the Friday evening, apart from the general introduction, I didn't get to any sessions, but instead spent my time talking to a bunch of people about XMPP, Publish-Subscribe, and the work I am doing on federating social networks under the name Open-CI at Mediamatic Lab.
The next two days were filled with sessions and hallway talk on OpenID, OAuth, different approaches to Publish-Subscribe and inter-site communication, resource and service discovery, and service scalability. While most of the topics were similar to last year's, I was glad to share what we've done at Mediamatic Lab over the past year, while learning how others have fared. We used these technologies to make a true federation of social networking sites where you can make cross-site relations between people and their social objects. Some of our discoveries there were shared with the participants, while others had interesting alternative approaches.
Especially interesting to me was a session on OAuth and OpenID where I could explain how we tried to improve the user experience. Both technologies have a bad reputation in this area. With some smart defaults and trust between sites, we could eliminate some of the screens. There was talk about using pop-ups in some situations, either as lightboxes or as new (small) windows. In our experience the former can't be used if you want to do SSL (since you can't validate the address and certificate), and the latter was deemed confusing in our user tests. Research is still ongoing, I suppose. The other issue had to do with presenting OpenID providers. We currently use a drop-down, but that doesn't scale up very nicely. Logos might work, but in the end have the same issue.
I also got to show Blaine Cook the code I recently wrote to make it easier to write XMPP publish-subscribe enabled services (code-as-a-node), which has been included in the recent Wokkel release. In turn, Blaine shared his thoughts on simple addressing on the web, and we got to hash it out with a bunch of people like Brad Fitzpatrick, who also organized the pubsub shootout session. Finally, Eran Hammer-Lahav showed his work on XRD.
I'm pretty sure I forgot to mention a lot of things, but when it comes back to me, I'll write about it some other time.
In part 1 I wrote about what you can subscribe to and how a social network service sends out notifications. I often used node as the thing you subscribe to, a term that comes directly from the XMPP Publish-Subscribe specification. In other publish-subscribe implementations this is often referred to as a topic. Nodes are kept by a publish-subscribe service and, among other things, this service is responsible for keeping the list of subscribers and sending out notifications.
Publish-subscribe services currently come in two forms: dedicated
publish-subscribe services with their own domain (e.g.
pubsub.ik.nu) and publish-subscribe services tied
to a user account (often mentioned in combination with the Personal Eventing
Protocol, also known as PEP). In the latter case, nodes are
kept at the bare JID of a user's account (e.g.
firstname.lastname@example.org). Personal pubsub nodes have nice
properties, like the ability to directly associate a particular node
with a person, and the possibility of doing access control on the
user's contact list (roster).
In the context of federating social networks, a service needs to decide where to put the nodes it wants to allow other entities to subscribe to and send out notifications from. In some cases it makes sense to keep nodes at user accounts, though in some other cases it is better to provide the nodes at the domain of the service itself. This depends on the nature of the social objects and the subscribable unit you provide. Let's explore some use cases.
In Jaiku, social objects (microblog posts and aggregated items like photos, bookmarks, etc.) are organized in streams. Streams are tied to either a user or a channel, and don't change ownership. The social objects themselves are static: once created, they cannot be edited. They can have comments associated with them, but those cannot be edited either. The only thing that can happen to streams, stream items, and comments is deletion.
Here, it makes sense to have a node for each stream, and
possibly a node for the comments to each stream item. Those can
be tied to the owner's JID (e.g.
#email@example.com). Another possible node could
be: all comments by a person. Another node an entity might want to
subscribe to is: all public microblog posts. Such a node would be
associated with the domain of the service rather than any
particular user's JID.
The company I work for, Mediamatic Lab, has a (proprietary) CMS called anyMeta. Instead of 'content', the C in CMS here stands for Community, to highlight the social network properties it provides. anyMeta is a highly semantic system that deals in things (a person, an article, an event, a blog) and edges (the relations between things, each with a predicate like friend-of, author-of, etc.). I mainly work on federating instances of anyMeta.
Things in anyMeta are usually editable, so it makes sense to want to keep informed about changes. For example, an article can have a large number of edits, and a person might move, change employers, or have other changes to their profile. Thus, we chose to at least provide each thing as a subscribable unit. Upon creating a thing, a new node is created, and a representation of the thing is published to the node. Editing a thing results in subsequent publishes. Subscribers receive notifications as the node gets published to.
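The thing-to-node lifecycle can be sketched as follows (hypothetical class and method names; the real implementation talks XMPP to a separate publish-subscribe service):

```python
class ThingPubSub:
    """Sketch: each thing gets its own node on creation; creating or
    editing a thing publishes a representation of it to that node,
    and subscribers to the node are to be notified."""
    def __init__(self):
        self.nodes = {}        # node id -> list of published items
        self.subscribers = {}  # node id -> set of subscriber JIDs

    def create_thing(self, thing_id, representation):
        self.nodes[thing_id] = [representation]
        self.subscribers[thing_id] = set()

    def subscribe(self, thing_id, jid):
        self.subscribers[thing_id].add(jid)

    def edit_thing(self, thing_id, representation):
        self.nodes[thing_id].append(representation)
        return self.subscribers[thing_id]  # these get a notification
```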
We organized the nodes in a flat namespace, tied to a domain, rather than a user. One reason is that the owner of any particular thing might change. Tying a node to the first owner, and then needing to move it when the owner changes, is cumbersome.
Each node has an identifier that is unique within the
publish-subscribe service holding it. So you could have two nodes
named updates, tied to two different users. Node
identifiers are opaque; one should not derive meaning from how the
node identifier looks. Embedded slashes might suggest some
hierarchy, for example, but an application should not assume that
such a hierarchy actually exists.
That said, it makes perfect sense to use logical, human-readable identifiers for nodes. They might even be very similar to the URI layout of the service's web site. Let's see what one could do for the examples given above.
It makes sense to have the node identifier for the regular
posts (called presence) be
presence, and the
nodes for the individual posts (with comments) be
presence/123456, where the number is the same
as used in the web page for that post. Those two examples would be
tied to a JID representing me at Jaiku.
The node for all public posts could be called
explore and located at the JID of the whole
service, jaiku.com. This would be similar to
the web site, where all public posts can be viewed at http://jaiku.com/explore.
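The parallel between the web site's URI layout and node addressing could be sketched like this (a hypothetical helper; real node identifiers are chosen by the service and must be treated as opaque):

```python
def node_address(web_path, user=None, domain="jaiku.com"):
    """Map a web path to an (address, node) pair: a user's streams
    live at the user's JID, while site-wide nodes live at the JID of
    the service itself."""
    node = web_path.strip("/")
    jid = f"{user}@{domain}" if user else domain
    return jid, node
```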
It might also make sense to have a dedicated node for a user's
profile information, which can be retrieved and presented by a
service or application that consumes the social object updates. At
least a (full) name and some icon or headshot would be nice to
have there. Obviously, subscribing to such a node means that
future profile changes will also propagate to the consuming
entities. An example identifier would be
profile, to be kept at the user's JID.
In anyMeta, each thing has an identifier that could be used as the node identifier as well. However, in the current implementation, all nodes are held by a loosely coupled, generic publish-subscribe service that caters to multiple anyMeta instances. We chose to use unique identifiers as generated by the publish-subscribe service, which bear no relation to the thing identifiers.
As you might have guessed, some of the stuff being discussed here has already been implemented in anyMeta. The publish-subscribe service used is Idavoll. It has grown an interface that is used (internally) to create new nodes, publish items that represent things, and subscribe to, and receive notifications from, remote publish-subscribe nodes. The thing holding my Mediamatic profile, for example, is represented by a node at pubsub.mediamatic.nl. All things in this site, but also the PICNIC site, have nodes like this. In a future post I will explore what we do with these nodes.
In this part, we explored how one could organize the nodes that entities can subscribe to in order to get updates. Some might be tied to the (virtual) JID of the user's account, others associated with the JID of the service itself. Node identifiers might be human-guessable, much like web URIs, or could be seemingly random opaque strings. Implementations that subscribe to, and consume notifications from, the nodes at social networking services should not assume anything about the organization and naming at the providing service. This presents a challenge for the next episode: how does one know which nodes are there and what they are called? So, up next: discovery. Homework assignment: look carefully at the HTML of my Mediamatic profile page.
The use of XMPP publish-subscribe in federation and third-party applications deviates a bit from the standard use-case. Usually publishing, subscribing and receiving notifications happen through the same protocol on specific (leaf) nodes. Entities subscribe to a node that represents a particular thing they are interested in getting updates for, and when an item is published to that node, these subscribers will receive a notification for that item.
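For reference, the subscription half of that standard flow looks roughly like this on the wire. A sketch of the XEP-0060 subscribe request, again built with Python's standard library and with illustrative JIDs:

```python
import xml.etree.ElementTree as ET

NS_PUBSUB = "http://jabber.org/protocol/pubsub"

def make_subscribe_iq(service_jid, node, subscriber_jid):
    """Build an XEP-0060 subscription request for a single node.
    Notifications for items published to that node will then be
    delivered to the subscriber over its XMPP connection."""
    iq = ET.Element("iq", {"type": "set", "to": service_jid, "id": "sub1"})
    pubsub = ET.SubElement(iq, "{%s}pubsub" % NS_PUBSUB)
    ET.SubElement(pubsub, "{%s}subscribe" % NS_PUBSUB,
                  {"node": node, "jid": subscriber_jid})
    return iq
```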
For federating social networks, the focus is on the exchange of updates on social objects and comments between services. For third-party applications, the most important thing is getting updates, preferably as soon as possible. For both of those use cases, receiving notifications through XMPP has an edge over HTTP: no polling, lower latency, fewer connections.
How these items are published does not really matter that much. What you will typically see is that services somehow have a new item available (submitted via the web, SMS, e-mail or a web-based API) and want to expose it through XMPP. Posting a new update through XMPP from a third-party client usually does not provide an advantage over existing web-based APIs.
For a service like Jaiku, Twitter or Identi.ca to provide XMPP publish-subscribe support, it is important to define the subscribable unit and provide that as a node. Such a node will usually not be published to directly, but is more of an aggregate node. Examples would be: all updates by a particular user, all updates in a particular channel, all updates by a user and his contacts, all public updates. Another example could be: all comments on a particular social object.
Conceptually, all such aggregate nodes are internally subscribed to a particular subset of new and updated social objects and comments. You might even implement it exactly like that. Think of a prospective search that is captured by a node: every time a new item comes into the service, it is determined which of the provided nodes would be a match for this item, based on author, contact lists and permissions. Subsequently, for all of those nodes, a notification is sent out to their subscribers. Telling items apart in this scenario is then likely not done using the service JID, node identifier or item identifier, but using some identifier in the payload, like Atom's id element, although those other identifiers might provide additional context.
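The prospective-search idea can be sketched as a small dispatch loop. Everything below (the node predicates, the notification callback, the item shape) is illustrative rather than an actual service implementation:

```python
# Each aggregate node is modeled as a predicate over incoming items
# plus a list of subscriber JIDs. On every new item, notify the
# subscribers of all matching nodes.
def dispatch(item, nodes, send_notification):
    """nodes maps a node identifier to (predicate, subscribers)."""
    for node_id, (matches, subscribers) in nodes.items():
        if matches(item):
            for jid in subscribers:
                send_notification(jid, node_id, item)

# Example: an "explore" node for all public items, plus a per-user
# node that matches everything authored by one user.
nodes = {
    "explore": (lambda item: item["public"], ["alice@example.org"]),
    "user/bob": (lambda item: item["author"] == "bob",
                 ["carol@example.org"]),
}
```

A single public post by bob would then fan out through both nodes, which is exactly why item identity has to live in the payload rather than in the node identifier.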
For those familiar with the concept of XMPP publish-subscribe collection nodes: those would be a special form of aggregate node, one that makes explicit its relationship to the nodes it aggregates items for.