Simon’s Backup Weblog


Failing to get a date on the web

Posted in Uncategorized by Simon Bisson on January 31, 2006

If you run a search engine, here’s an idea for free.

I like being able to search the web – for web pages, for map information, for local information. It’s become a default action for me: I keep wireless PCs in all the rooms of the house, and my phone has Google bookmarked and ready to use wherever I am. I can concoct search queries that get me to the information I want quickly and easily, avoiding paid search results and catalogue aggregation sites. Boolean search logic is just another language now. However, sometimes I’d like to search by date.

This is where the utopia of the search engine suddenly comes to a screaming halt.

Yesterday and I were planning a US trip (which is starting to look like it will fill most of March). As we were using the British Airways sale to book tickets, flights were limited – and the longer we stayed, the better the deal. So I tried to see if there were any interesting events we could visit whilst we were over.

And that’s where everything fell apart.

Could I get a search term to let me know if there were any conferences we could visit? No matter how I tweaked the search terms, I kept getting the same list, which I had to page through to track down the information I needed. I couldn’t sort it, filter it, or even tune the search.

The trouble is: search engines don’t really like you trying to look for date ranges. A search term like “technology conference march 2006” doesn’t really mean anything to Google. You’ll get lots of results, but nothing to help you order results by date or by location. Date-based searching is as important as location-based search (and the two combine very well indeed). As more and more people use, and rely on, search engines, the queries they use will become more and more complex.

Of course there are issues with the semantics of date (let alone with date formatting). Are you asking for web pages created on a certain date, or containing information about that date? Are you specifying a date range, or are you looking for a specific date? These are complex questions – and I wonder if they’re the reason why the oft-rumoured Google Calendar is yet to appear.

Microformats are one approach that could help here. Technorati has defined hCalendar as a web page-embedded XML-tagged equivalent for the familiar iCalendar format, which would allow search engines to build arrays of calendar data for search that could be linked to web content. Alternatively, iCalendar-driven calendars are appearing all over the web. Apple’s iCal uses the format to publish web calendars to .Mac, as do Outlook and the new calendar tool in Windows Vista – and there are plenty of open source iCalendar servers, as well as desktop calendaring applications that can handle iCalendar data.
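As a rough illustration of what sortable calendar data could look like to a search engine, here is a minimal Python sketch that reads iCalendar text and keeps the events overlapping a date range. The sample events, the function names and the simplified parsing (no line unfolding, no time zones, no recurrence) are all assumptions made for illustration, not a description of how any real search engine or calendar tool works.

```python
from datetime import date, datetime

# Made-up sample data: two events in iCalendar syntax.
SAMPLE_ICS = """BEGIN:VCALENDAR
VERSION:2.0
BEGIN:VEVENT
SUMMARY:Example Tech Conference
DTSTART;VALUE=DATE:20060314
DTEND;VALUE=DATE:20060317
LOCATION:San Francisco
END:VEVENT
BEGIN:VEVENT
SUMMARY:Example Developer Day
DTSTART;VALUE=DATE:20060502
DTEND;VALUE=DATE:20060503
LOCATION:Seattle
END:VEVENT
END:VCALENDAR
"""


def parse_events(ics_text):
    """Collect the properties of each VEVENT into a dict (simplified parsing)."""
    events, current = [], None
    for line in ics_text.splitlines():
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT":
            events.append(current)
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            name = name.split(";", 1)[0]  # drop parameters such as ;VALUE=DATE
            current[name] = value
    return events


def to_date(value):
    """Read the leading YYYYMMDD part of an iCalendar date or date-time value."""
    return datetime.strptime(value[:8], "%Y%m%d").date()


def events_between(events, start, end):
    """Return events overlapping [start, end], sorted by start date.

    DTEND is treated as exclusive, as it is for all-day events in iCalendar.
    """
    hits = [
        e for e in events
        if to_date(e["DTSTART"]) <= end and to_date(e["DTEND"]) > start
    ]
    return sorted(hits, key=lambda e: to_date(e["DTSTART"]))


march = events_between(parse_events(SAMPLE_ICS), date(2006, 3, 1), date(2006, 3, 31))
for event in march:
    print(event["DTSTART"], event["SUMMARY"], event["LOCATION"])
```

Once events carry explicit start and end dates, "what is on in March?" becomes a simple interval-overlap filter followed by a sort.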

What’s to stop these tools being used to ping a date registry when calendar information is posted in a public space?

So, Apple, Microsoft, Technorati, Google, FAST, Yahoo! and MSN, get together and give us time-based search. The UI doesn’t really matter (though a calendar grid would work really well for drill down), as long as we can sort results by date…

13 Responses to 'Failing to get a date on the web'

  1. andrewducker said,

    Actually, Google could force this one easily – if they said “Make date data available in format X and we’ll use it.” then everyone else would fall quickly into line…

  2. marypcb said,

    Google Events – a logical extension for Google movie times?

    http://upcoming.org/
    Upcoming seems to be a kind of flickr for events – down to being snapped up by Yahoo. I wonder if date is going to be 2006’s location? I don’t know if their event badges are using a microformat… But there’s no Date view: while you can see what’s popular, what’s on in SF or LA, what your friends have organised, you can’t see what’s on Tuesday next until you drill down to a specific location.

    And the step beyond – being able to get things into my local calendar from a search other than copy, create event, paste!

  3. quercus said,

    Google already says this, for many basic search and metadata publishing features. Yet the site builders, including some very big and serious ones, just don’t bother to pay attention. This is why it’s still very easy to cream SEO results if you have the faintest smattering of Clue.

    As to the date formats, then they’re out there. Been out there for years. Doesn’t need anything so tightly application-bound as vCal or iCalendar either (and if you think XML is the solution, then you’ve missed the point).

    This whole issue (technically) is so 2001 8-( We fixed this one ages ago – start using it, guys.

  4. andrewducker said,

    Can you point me at a further resource for this? What data format should be being used here, and where is it defined?

  5. razorsmile said,

    “Of course there are issues with the semantics of date (let alone with date formatting). Are you asking for web pages created on a certain date, or containing information about that date? Are you specifying a date range, or are you looking for a specific date?”

    Both, ideally. I posted on this sometime last year

    “What’s to stop these tools being used to ping a date registry when calendar information is posted in a public space?”

    An online counterpart to the atomic clock?

    “So, Apple, Microsoft, Technorati, Google, FAST, Yahoo! and MSN, get together and give us time-based search. The UI doesn’t really matter (though a calendar grid would work really well for drill down), as long as we can sort results by date…”

    Hear hear.

  6. miramon said,

    I’ve tried doing date-based searches in Google; I reckon people update their ancient items just so they show up as recent. It’s not just the search engines though. I’d really like to see a version of RDF that allows you to define a date-dependent resource. As it is, you can make a semantic connection in RDF, but it’s assumed to be valid for ever (which is often not the case in the real world). I’ve come to the conclusion that nobody on the internet ever thinks about time as something that is worth paying attention to.

  7. nmg said,

    The best bet for an unambiguous date format that’s easily parsed by search engines is that defined in ISO8601. Current best practice seems to be to stick to the subset identified in this W3C note.
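To make the ISO 8601 suggestion concrete, here is a small Python sketch, assuming only the reduced-precision forms that ISO 8601 permits (year, year-month, and full date). The `date_span` helper is a hypothetical name, introduced here to show how a coarse date such as "2006-03" can be turned into a searchable range.

```python
import calendar
from datetime import date


def date_span(text):
    """Return the (first_day, last_day) covered by 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'."""
    parts = [int(p) for p in text.split("-")]
    if len(parts) == 1:
        (year,) = parts
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        year, month = parts
        last_day = calendar.monthrange(year, month)[1]  # number of days in the month
        return date(year, month, 1), date(year, month, last_day)
    year, month, day = parts
    return date(year, month, day), date(year, month, day)


print(date_span("2006-03"))
print(date_span("2006-01-31"))

# Full ISO 8601 dates also sort chronologically as plain strings.
print(sorted(["2006-03-14", "2005-12-25", "2006-01-31"]))
```

Because the fields run from most to least significant, an index can order such dates without parsing them at all, which is part of what makes the format attractive for search.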

  8. nmg said,

    This is a bit of a blind spot for the Semantic Web at present, mostly because there’s no easy way of layering a (modal) temporal logic on top of RDF/OWL while still keeping the favourable complexity of subsumption reasoning within the description logic that underlies OWL. There’s a long history of temporal reasoning within the AI and knowledge engineering communities (the lineal ancestors of the SW movement), but the standards process for RDF and other SW languages has been dominated by the point of view that holds DL subsumption reasoning to be the most important type of reasoning on the nascent SW.

    As far as I’m aware, most SW researchers who are actually addressing this problem are doing much the same as we are (we being my research group in Southampton); we have a simple ontology (based in part around the dc:coverage predicate) that allows us to define the period during which the triples in a given RDF graph are valid. Graph-level assertions, which are usually made by making statements about the RDF/XML file that contains the serialisation of the graph, are generally thought preferable to triple-level assertions (in which RDF reification is used to provide a target for the validity assertions).

  9. del_c said,

    I would at least like to be able to constrain searches according to how old the page is (creation or update, I’d be grateful for anything at this stage). Google advanced search lets you select web pages updated in the last 3 or 6 months prior to the search, but I never saw that selecting that box had any noticeable effect. Nor can I tell if tweaking the URL gets anywhere. The term as_qdr=m3 or as_qdr=m6 appears in the URL, but I can’t get it to do anything interesting.
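For anyone who wants to repeat the URL experiment described above, here is a minimal Python sketch that builds such a query string. The `as_qdr` parameter and its `m3`/`m6` values come from the comment; the endpoint URL, the `q` parameter and the `search_url` helper are assumptions for illustration, and nothing here says whether the engine actually honours the setting.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(query, recency=None):
    """Build a search URL, optionally adding the as_qdr recency term (e.g. 'm3')."""
    params = {"q": query}
    if recency:
        params["as_qdr"] = recency
    # Assumed endpoint; adjust to whatever search page is being tested.
    return "https://www.google.com/search?" + urlencode(params)


url = search_url("technology conference", "m3")
print(url)

# Reading the term back out of a URL, to compare variants side by side.
print(parse_qs(urlsplit(url).query))
```

Generating the m3 and m6 variants this way makes it easier to compare result lists and see whether the term has any visible effect.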

  10. andrewducker said,

    But it’s not time formats we’re talking about here – it’s event formats. There’s no point having a list of various dates and the events that are happening on each of them, unless there’s enough syntax there to be able to parse which event is when, and where.

  11. sbisson said,

    Hence hCal.

    It’s also interesting to note how Upcoming handles events.

  12. stillcarl said,

    “The trouble is: search engines don’t really like you trying to look for date ranges. A search term like ‘technology conference march 2006’ doesn’t really mean anything to Google. You’ll get lots of results, but nothing to help you order results by date or by location.”

    An alternative approach is to search for a search engine that might do what you want. For instance, “conference search engine” turned up…

    http://www.allconferences.com/

    Now if only its advanced search didn’t seem to behave in a random manner… 🙂

  13. nmg said,

    My mistake – I misread data for date.

    The most common format for event data seems to be iCal, but that has the disadvantage of being none too pleasant to parse. There’s an RDF schema based on iCal which suffers the same conceptual problems as iCal, but with the added advantage/disadvantage of RDF syntax, depending on who you talk to (I’m firmly in the pro-RDF camp, and even I think that RDF-calendar is a bit awkward). Failing that, there are the microformats such as hCal, which again is essentially another syntax for iCal, but one which has been shoehorned into XHTML.
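To show roughly what reading hCal out of a page involves, here is a minimal Python sketch using the standard library’s `html.parser`. The sample markup, the `HCalendarParser` class and the handful of fields it recognises are assumptions for illustration; it ignores nested markup inside a field and void elements such as `<br>`, so it is nowhere near a complete hCalendar parser.

```python
from html.parser import HTMLParser

# Made-up sample page fragment marked up with hCalendar class names.
SAMPLE_HTML = """
<div class="vevent">
  <span class="summary">Example Tech Conference</span>
  <abbr class="dtstart" title="2006-03-14">March 14</abbr>
  <span class="location">San Francisco</span>
</div>
"""


class HCalendarParser(HTMLParser):
    FIELDS = {"summary", "dtstart", "dtend", "location"}

    def __init__(self):
        super().__init__()
        self.events = []
        self._event = None  # dict for the vevent currently being read
        self._depth = 0     # element nesting depth inside that vevent
        self._field = None  # field whose text content is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if self._event is None:
            if "vevent" in classes:
                self._event, self._depth = {}, 1
            return
        self._depth += 1
        found = classes & self.FIELDS
        if found:
            name = found.pop()
            if tag == "abbr" and attrs.get("title"):
                # Machine-readable value lives in the title attribute.
                self._event[name] = attrs["title"]
            else:
                self._field = name
                self._event[name] = ""

    def handle_data(self, data):
        if self._event is not None and self._field:
            self._event[self._field] += data

    def handle_endtag(self, tag):
        if self._event is None:
            return
        self._depth -= 1
        self._field = None
        if self._depth == 0:  # closing tag of the vevent element itself
            self.events.append(self._event)
            self._event = None


parser = HCalendarParser()
parser.feed(SAMPLE_HTML)
print(parser.events)
```

The human-readable text ("March 14") stays in the page, while the ISO 8601 value in the `title` attribute is what a crawler would index.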

