Category Archives: tech

General technology issues

WWW2006 Workshop – Logging Traces of Web Activity

I am one of the organizers for the WWW2006 Workshop – Logging Traces of Web Activity: The Mechanics of Data Collection at the WWW2006 Conference in Edinburgh, Scotland in May 2006.

We invite position papers for the WWW 2006 workshop “Logging Traces of Web Activity: The Mechanics of Data Collection”. Many WWW researchers require logs of user behaviour on the Web. Researchers study the interactions of web users, both with respect to general behaviour and in order to develop and evaluate new tools and techniques.

Traces of web activity are used for a wide variety of research and commercial purposes including user interface usability and evaluations of user behaviour and patterns on the web. Currently, there is a lack of available logging tools to assist researchers with data collection and it can be difficult to choose an appropriate technique. There are several tradeoffs associated with different methods of capturing log-based data. There are also challenges associated with processing, analyzing and utilizing the collected data.

This one day workshop will examine the trade-offs and challenges inherent to the different logging approaches and provide workshop attendees the opportunity to discuss both previous data collection experiences and upcoming challenges. The goal of this workshop is to establish a community of researchers and practitioners to contribute to a shared repository of logging knowledge and tools. The workshop will consist of a panel discussion, participant presentations, demonstrations of logging tools and prototypes, and a discussion of the next steps for the group. Participation is open to researchers, practitioners, and students in the field.

The deadline for workshop proposals is January 10, 2006. I hope to see you there.

New Book: Theories of Information Behavior

I have been remiss in not mentioning that a new book I wrote a chapter for, Theories of Information Behavior, is finally out.

From the blurb:

This unique book presents authoritative overviews of more than 70 conceptual frameworks for understanding how people seek, manage, share, and use information in different contexts. A practical and readable reference to both well-established and newly proposed theories of information behavior, the book includes contributions from 85 scholars from 10 countries. Each theory description covers origins, propositions, methodological implications, usage, links to related conceptual frameworks, and listings of authoritative primary and secondary references. The introductory chapters explain key concepts, theory, method connections, and the process of theory development.

Check out the Table of Contents (pdf file). (Mine is the last chapter in the book; funnily enough, the chapters are organized alphabetically by title.)

Amazon.com link to Theories of Information Behavior. American Society for Information Science & Technology Member Price is 20% off now.

SIGIR 2006 Call for Papers

The ACM Special Interest Group on Information Retrieval (SIGIR) has its SIGIR 2006 Draft Call for Papers out already. The conference will be in Seattle next August.

SIGIR is one of the best academic conferences for keeping up with what’s new and what’s possible in Web search and, increasingly, in desktop search and mobile device search. For 2006 I expect we will see more about vertical search and even blog search, as well as some new insights into user behavior for IR.

Call for Papers: WWW2006 Conference

New notice for participation at the 15th Annual World Wide Web conference in Edinburgh, Scotland (one of my favorite cities).

I will be a reviewer again this year in the Browsers and User Interfaces track, where there are usually a number of amazing systems and interfaces. Here’s some text describing the track:

The Browsers and User Interfaces track at WWW’2006 focuses on promoting novel research directions and providing a forum where researchers, theoreticians, and practitioners can introduce new approaches, paradigms, applications, share their knowledge and opinions about problems and solutions related to accessing and interacting with data, services, and other humans over the Web. We invite original papers describing both theoretical and experimental research including (but not limited to) the following topics:

  • Browsers and user experience on mobile devices
  • Browser interoperability
  • Novel client-side applications
  • Multimodal interfaces, including speech interaction
  • Information visualization on the Web
  • Multilingual Web content design
  • Novel browsing and navigation paradigms
  • Web interaction with the real world, including robotics and sensor networks
  • Adaptive Web displays and Web personalization
  • Ubiquitous web access, shared displays, and wearable computing
  • Web usability and user experience
  • Web accessibility
  • Web-based collaboration and collaborative Web use
  • Web-logs and online journalism

Hope to see you there.

Study of Yahoo and Google Indices

A fresh approach to analyzing which search engine has the more comprehensive index: A Comparison of the Size of the Yahoo and Google Indices. It would be interesting to see this study at another order of magnitude, perhaps with MSN included. What I like best is that the study authors released the code for the tests. I seem to be finding that more academics are providing code to let others attempt to verify their study firsthand, build on the study to make relatable comparisons, and, most importantly, to provide the opportunity for peer review of the code logic behind what the study claims.

Mining and automating adding buddies in Google Talk

Did you know that anyone with a Gmail address has a Google Talk ID by default? Just for fun, I did a grep through all my email files for addresses that match the pattern “@gmail.com” and did a quick regex insert of some of them into the blist.xml file that GAIM and Adium use to keep your buddy lists. This was an easy way (well, for me at least) to add a group of people to my buddy list. The next time one of your new invitees logs on, they get an invitation from you to be added to their Google Talk buddy list.
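For anyone who would rather script the grep step, here is a minimal sketch in Python. It only covers pulling out and de-duplicating the addresses. The pattern is a simplified assumption about what an address looks like, the sample text is made up, and the insert into blist.xml is left out because the post does not describe that file's layout.

```python
import re

# Assumption: a simplified address pattern, not a full RFC 5322 parser.
GMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@gmail\.com\b", re.IGNORECASE)


def find_gmail_addresses(text):
    """Return the unique Gmail addresses found in text, lowercased and sorted."""
    return sorted({match.lower() for match in GMAIL_RE.findall(text)})


# Hypothetical sample standing in for the contents of a mail file.
sample = """From: Alice <alice.example@gmail.com>
To: bob@example.org
Cc: ALICE.EXAMPLE@GMAIL.COM, carol+lists@gmail.com
"""

for address in find_gmail_addresses(sample):
    print(address)
```

Lowercasing before de-duplicating keeps the same person from showing up twice when their address is capitalized differently in different messages.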

So if you get an invite from me, now you know why… Well maybe not actually why, but at least you know how.

If I didn’t send you an invite, try me. I’ll give you just one guess at my Google Talk ID.

International Verify your Backups Day 9/9

I know we already have a lot of holidays and special occasions in September but I think we need another one. Let’s make September 9th International Verify your Backups Day. On 9/9 it seems like a good idea to make sure that at least 99% of the files you’ve been backing up can be recovered. If not, why back things up?

I am certain that many of us are sporadically dutiful in using backup software, compressing a bunch of files and copying them to a CD or syncing with a backup server. All too often this labor is lost because, when the time comes, we can’t actually recover the files or make sense of what we do recover (and there will always be a time when you need to recover some data). Why not spend a few minutes making sure that all of that effort isn’t in vain? Try to recover some of your old files and make sure they’re file-liciously fresh and usable!
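If your backups are zip files, a check like this can be scripted. The following is a rough sketch, not a complete verification tool: it reads every member of an archive (which makes Python's zipfile module check each CRC) and reports how many came back intact. The function names and the 0.99 default, taken from the 99% figure above, are illustrative choices, and a readable file is not necessarily a useful one, so opening a few by hand is still worthwhile.

```python
import tempfile
import zipfile
import zlib
from pathlib import Path


def verify_zip_backup(path):
    """Return (readable, total) member counts for the zip archive at path."""
    readable = 0
    with zipfile.ZipFile(path) as archive:
        members = archive.infolist()
        for info in members:
            try:
                # Reading to the end forces decompression and a CRC check.
                with archive.open(info) as member:
                    while member.read(1 << 20):
                        pass
                readable += 1
            except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
                pass  # count this member as unrecoverable
    return readable, len(members)


def meets_threshold(readable, total, threshold=0.99):
    """True when at least `threshold` of the members could be read back."""
    return total > 0 and readable / total >= threshold


if __name__ == "__main__":
    # Demo on a throwaway archive; point verify_zip_backup at a real backup instead.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.zip"
        with zipfile.ZipFile(demo, "w", zipfile.ZIP_DEFLATED) as archive:
            archive.writestr("notes.txt", "hello")
            archive.writestr("todo.txt", "verify backups")
        readable, total = verify_zip_backup(demo)
        print(f"{readable} of {total} files readable")
```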

Yes, for some of you, this means that September 8th will be International Backup Day – but that’s OK, at least you’re backing up your valuable data.

How do I back up? I work on three different systems (with four different operating systems between them, sigh) and try to keep most of my working files in one main directory that’s the same on each. I routinely compress and back up this directory into one large file and make the date of the backup part of the file name (as in 08-08-2005-Docs.zip). Then I copy this file to another hard disk, burn it to a CD, label the CD with a Sharpie marker and store it in my home or office (alternating between the two). I also have specific configuration files for each system I work on, and I back those up too with a combination of small scripts (to run a copy, merge and compress sequence). I then either keep the backup on the particular system in a directory called “backup”, SFTP it to a server or, less frequently, burn it to CD. I usually do not worry about backing up whole applications, since in most cases it’s easier to reinstall an application than to manage a huge backup file. Much less frequently, I use a full disk backup application (like Retrospect, which I really don’t care for so much) and keep the giant backup file on an external 250GB hard disk.
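The compress-and-date step is easy to script. This is a minimal sketch rather than the actual scripts described above: the function names are made up for illustration, and the month-day-year order is an assumption, since the example name 08-08-2005-Docs.zip reads the same either way.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_name(day, label="Docs"):
    """Build a base name like 08-08-2005-Docs (month-day-year is assumed)."""
    return f"{day.strftime('%m-%d-%Y')}-{label}"


def make_dated_backup(source_dir, dest_dir, day=None, label="Docs"):
    """Zip the contents of source_dir into dest_dir and return the archive path."""
    day = day or date.today()
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    base = dest_dir / backup_name(day, label)
    # make_archive appends ".zip" to the base name and returns the full path.
    return Path(shutil.make_archive(str(base), "zip", root_dir=str(source_dir)))


print(backup_name(date(2005, 8, 8)) + ".zip")
```

Keep dest_dir outside source_dir so the archive does not end up trying to include itself. Copying the result to a second disk or burning it to CD is left as a separate step.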

For other content like all my music files, I just do a full copy to an external drive (I have three external drives, all at least 250GB in size) and rotate among them.

I have tried many other systems, like using version control, automated .Mac-like backup services, and any number of personal or large-scale sync applications (more on them in a later post), but none seem to have the simplicity of what I’m using now.

How often do you back up your data? How do you do it?