Tuesday, May 14, 2013

Realtime Big Data

In a post about architecting realtime big data systems, James Kinley splits things up into 3 layers: batch, serving and speed.



The architecture has merit even apart from the specific technologies he employs at each layer. Especially useful is the way it combines batch and real-time views into a best-of-both-worlds query capability. Without an approach like this, you have to commit either too soon or too late to the aggregations your queries will use.

It is a speed vs. flexibility problem. Commit too soon, and you will have a fast system that doesn't give you much query flexibility. Commit too late, and you will have a flexible but slow system.

Kinley's architecture shows that you can have flexibility without giving up speed.
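
To make the idea of combining views concrete, here is a minimal sketch in Python. It is not Kinley's implementation: the page-view counting example and the names `build_batch_view`, `SpeedView`, and `query` are my own assumptions, chosen only to show a query that merges a precomputed batch view with an incrementally updated real-time view.

```python
from collections import Counter


def build_batch_view(events):
    """Batch layer (hypothetical): recompute counts from the full history.

    Slow to produce, but it can be rebuilt with any aggregation later.
    """
    return Counter(event["page"] for event in events)


class SpeedView:
    """Speed layer (hypothetical): counts only events seen since the last batch run."""

    def __init__(self):
        self.counts = Counter()

    def update(self, event):
        # Incremental update, applied as each new event arrives.
        self.counts[event["page"]] += 1


def query(page, batch_view, speed_view):
    """Serving-side query: merge the batch result with the recent delta."""
    return batch_view.get(page, 0) + speed_view.counts.get(page, 0)


if __name__ == "__main__":
    history = [{"page": "/home"}, {"page": "/home"}, {"page": "/about"}]
    batch = build_batch_view(history)

    speed = SpeedView()
    speed.update({"page": "/home"})  # arrived after the batch run

    print(query("/home", batch, speed))  # 3
```

The batch view keeps the flexibility (rerun it over the raw events with a different aggregation whenever you like), while the speed view covers the gap between batch runs so that queries stay current.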

Wednesday, September 15, 2010

An outline of "Social Semantics" topics

A call for papers from DERI contains a good outline of topics relevant to real-time and ubiquitous Social Semantics:

  • From raw social data to semantic data
    • semantic grounding of raw social data
    • generation and aggregation of social semantics
    • ontologies and data models for social data representation and analysis
    • real-time semantic mining and analysis of social data
    • trends and dynamics in social semantic web
    • capturing and representing context in social networking
  • Ubiquitous Web and social semantics
    • integration of virtual and physical worlds
    • integration tools, technologies, and platforms
    • privacy, ethics, and confidentiality
    • presence tracking and semantic augmentation
    • semantic sensors and RFID
    • social semantics on mobile devices
  • Real-time querying frameworks and languages for social data
    • stream querying and reasoning on social data
    • location or time based reasoning, context based reasoning
    • querying volatile, moving and dynamic networks and data sources
    • dynamics, changesets and push-based notifications
    • scalability, approximate reasoning and querying in social applications
    • provenance and quality for querying social data

Anything you'd like to add to that list? Which ones do you see as the most important?

Sunday, September 5, 2010

The Social Semantic Web and SIOC

The Social Web is contributing to an acceleration of content creation that once again threatens to overwhelm us. Semantic Web technologies offer a way to greatly increase the value of this content while relieving the overwhelm.

For example, Twitter users often stop following people when they can't keep up with reading their streams. Reading streams directly is a first-generation activity; soon our applications will help read and organize the streams for us. Think Google Priority Inbox for tweets.

Sorting tweets is only the beginning. As our applications better understand the content of the Social Web, they will automatically link content relevant to our current context. Think Google AdSense for your to-do list, where the content of the "ads" is advice and links from your network related to what you're working on right now. This is an aspect of what I call Social Tasking.

One part of the solution involves the use of ontologies, and one of the ontologies that may help is SIOC: Semantically-Interlinked Online Communities. The vision of the SIOC project is that...

combining Semantic Web technologies and social media paradigms will lead to "Social Semantic Information Spaces", where information is socially created and maintained as well as being interlinked and machine-understandable, leading to new ways to discover information on the Web.

To enable a model for interoperability and portability among social data services, SIOC reuses portions of the FOAF (Friend of a Friend) and Dublin Core ontologies.

The SIOC project models social data on the Web using semantic technologies.
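
As a rough illustration of how the three vocabularies fit together, the sketch below describes a single post as RDF triples and prints them as N-Triples, using only the Python standard library. The `example.org` URIs and the helper names (`Literal`, `describe_post`, `to_ntriples`) are assumptions made for this example; the terms used (`sioc:Post`, `sioc:has_creator`, `sioc:content`, `sioc:UserAccount`, `sioc:account_of`, `foaf:Person`, `dcterms:title`) come from the published SIOC, FOAF, and Dublin Core vocabularies.

```python
SIOC = "http://rdfs.org/sioc/ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Literal(str):
    """Marks a plain string value, as opposed to a URI (helper for this sketch)."""


def describe_post(post_uri, account_uri, person_uri, title, text):
    """Return (subject, predicate, object) triples describing one post."""
    return [
        (post_uri, RDF_TYPE, SIOC + "Post"),
        (post_uri, SIOC + "has_creator", account_uri),
        (post_uri, DCTERMS + "title", Literal(title)),
        (post_uri, SIOC + "content", Literal(text)),
        (account_uri, RDF_TYPE, SIOC + "UserAccount"),
        (account_uri, SIOC + "account_of", person_uri),
        (person_uri, RDF_TYPE, FOAF + "Person"),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            escaped = (
                obj.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            )
            obj_text = '"' + escaped + '"'
        else:
            obj_text = "<" + obj + ">"
        lines.append("<" + subject + "> <" + predicate + "> " + obj_text + " .")
    return "\n".join(lines)


if __name__ == "__main__":
    triples = describe_post(
        "http://example.org/post/1",          # hypothetical URIs
        "http://example.org/account/alice",
        "http://example.org/person/alice",
        "One Four One",
        "A place for ideas longer than 140 characters.",
    )
    print(to_ntriples(triples))
```

Because the post, the account, and the person are each identified by a URI, a second service could describe the same person with its own account and the two descriptions would link up. That is the kind of interoperability the reuse of FOAF and Dublin Core is meant to enable.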

Interoperability among services is a start. Other semantic technologies are being developed to resolve the meaning of tags and to put more control in the hands of users. More on that later.

Saturday, September 4, 2010

One Four One

I needed a place for posting ideas or communications that are longer than 140 characters. So, this blog is named after the character that lives just across the boundary from microblog to blog -- character 141.