Methodologies for Understanding Web Use with Logging in Context

Posted in research, science, tech on April 30th, 2010 by donturn

[PDF]

Don Turnbull

Abstract

This paper describes possible approaches to data collection and analysis that can be used to understand Web use via logging. First, it reviews a method devised by Choo, Detlor, & Turnbull (1998, 1999, 2000) that offers a comprehensive, empirical foundation for understanding Web logs in context by gaining insight into Web use from three diverse sources: an initial survey questionnaire, usage logs gathered with a custom-developed Web tracking application, and follow-up interviews with study participants. Second, a method of validating different types of Web use logs is proposed that involves client browser trace logs, intranet server logs, and firewall or proxy logs. Third and finally, a system is proposed to collect and analyze Web use via proxy logs that classify Web pages by content.

Excerpt

It is often thought that in some configurations, the local caching settings of the client browsing application may influence server-based logging accuracy. If it is not efficient to modify each study participant’s browser settings (or if temporarily modifying participants’ browser settings for the study period would affect true Web use), a method of factoring in what may be lost due to local cache may be applied. … By tuning intranet server logging settings and collecting and analyzing these logs, some initial measurement of the differences that client browser caching makes in accurate firewall logs can be made. Measures from the organization’s intranet Web server logs, such as total requests per page, time to load, use of REST or AJAX interaction and consistent user identification, can be compared to the more raw logging collected from the firewall.
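
The comparison described in the excerpt can be sketched roughly as follows. This is only an illustration, not the tooling used in the paper: the `(user_id, url)` record format, the function names and the sample data are all assumptions, and real server or firewall logs would first need to be parsed into this shape.

```python
from collections import Counter


def requests_per_page(records):
    """Count requests per URL from an iterable of (user_id, url) pairs."""
    return Counter(url for _user, url in records)


def compare_logs(server_records, firewall_records):
    """Map each URL seen in either log to (server_count, firewall_count, difference)."""
    server = requests_per_page(server_records)
    firewall = requests_per_page(firewall_records)
    return {
        url: (server[url], firewall[url], server[url] - firewall[url])
        for url in sorted(set(server) | set(firewall))
    }


# Hypothetical sample records, not data from the study.
server_log = [("u1", "/home"), ("u1", "/home"), ("u2", "/home"), ("u2", "/docs")]
firewall_log = [("u1", "/home"), ("u2", "/home"), ("u2", "/docs")]

comparison = compare_logs(server_log, firewall_log)
for url, (in_server, in_firewall, diff) in comparison.items():
    print(url, in_server, in_firewall, diff)
```

A nonzero difference only flags pages where the two sources disagree. Client caching is one candidate explanation, which the survey and interview data could help confirm or rule out.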

Update

What’s novel about this paper is the introduction of using different datasets to validate or triangulate the veracity and accuracy of log data. Often, logs are collected and processed without context to explain subtle interaction patterns, especially in relation to user behavior. By coordinating a set of quantitative resources, often with accompanying qualitative data, a much richer view of Web use is achieved. This is worth remembering when relying on Web Analytics tools to form a picture of a Web site’s use or set of Web user interactions: you need to go beyond the basic statistical measures (often far beyond what typical log analysis software provides, certainly by their default reports) and design new analysis techniques to gain understanding.

Keywords

browser history, firewall logs, intranet server logs, web use, survey, questionnaire, client application, webtracker, interview, methodology, logs, server logs, proxy, firewall, analytics, content classification, client trace, transaction log analysis, www

Cite As

Turnbull, D. (2006). Methodologies for Understanding Web Use with Logging in Context. Paper presented at the 15th International World Wide Web Conference, Edinburgh, Scotland.

References in this publication

  • Auster, E., & Choo, C. W. (1993). Environmental scanning by CEOs in two Canadian industries. Journal of the American Society for Information Science, 44(4), 194-203.
  • Catledge, L. D., & Pitkow, J. E. (1995). Characterizing Browsing Strategies in the World-Wide Web. Computer Networks and ISDN Systems, 27, 1065-1073.
  • Choo, C.W., Detlor, B. & Turnbull, D. (1998). A Behavioral Model of Information Seeking on the Web — Preliminary Results of a Study of How Managers and IT Specialists Use the Web. Proceedings of the 61st Annual Meeting of the American Society of Information Science, 290-302.
  • Choo, C.W., Detlor, B. & Turnbull, D. (1999). Information Seeking on the Web – An Integrated Model of Browsing and Searching. Proceedings of the 62nd Annual Meeting of the American Society of Information Science, Washington, D.C.
  • Choo, C.W., Detlor, B. & Turnbull, D. (2000). Web Work: Information Seeking and Knowledge Work on the World Wide Web. Dordrecht, The Netherlands, Kluwer Academic Publishers.
  • Cunha, C.R., Bestavros, A. & Crovella, M.E. (1995). Characteristics of WWW Client-Based Traces. Technical Report #1995-010. Boston University, Boston MA.
  • Flanagan, J. C. (1954). The critical incident technique. Psychological Bulletin 51(4), 327-358.
  • Jansen, B. J., Spink, A. & Saracevic, T. (2000) Real life, real users, and real needs: a study and analysis of user queries on the Web. Information Processing & Management, Volume 36, Issue 2, pp 207-227.
  • Jansen, B. J. (2005) Evaluating Success in Search Systems. Proceedings of the 66th Annual Meeting of the American Society for Information Science & Technology. Charlotte, North Carolina. 28 October – 2 November.
  • Kehoe, C., Pitkow, J. & Rogers, J. (1998). GVU’s Ninth WWW User Survey Report. http://www.gvu.gatech.edu/user_surveys/survey-1998-04.
  • Pitkow, J. and Recker, M. (1994). Results from the first World-Wide Web survey. Special issue of Journal of Computer Networks and ISDN systems, 27, 2.
  • Pitkow, J. (1997, April 7-11). In Search of Reliable Usage Data on the WWW. Sixth International World Wide Web Conference Proceedings, Santa Clara, CA.
  • Rousskov, A. & Soloviev, V. (1999) A performance study of the Squid proxy on HTTP/1.0. World Wide Web, 2, 1-2, pp 47-67.

Recommended Reading

Jansen, B.J. and Ramadoss, R. and Zhang, M. and Zang, N. (2006). Wrapper: An application for evaluating exploratory searching outside of the lab. EESS, p 14.

Rating, Voting & Ranking: Designing for Collaboration & Consensus

Posted in research, tech on April 26th, 2010 by donturn

[PDF]

Don Turnbull

Abstract

The OpenChoice system, currently in development, is an open source, open access community rating and filtering service that would improve upon the utility of currently available Web content filters. The goal of OpenChoice is to encourage community involvement in making filtering classification more accurate and to increase awareness of the current approaches to content filtering. The design challenge for OpenChoice is to find the best interfaces for encouraging easy participation amongst a community of users, be it for voting, rating or discussing Web page content. This work in progress reviews some initial designs alongside best practices and designs from popular Web portals and community sites.

Excerpt

…Tim O’Reilly proposed the phrase “architecture of participation” to describe participatory Web sites and applications that encourage user-driven content, open source contribution models and simple access via APIs. So why are so many of these sites and applications under-designed at the interface and interaction level, not to mention having a vaguely architected overall structure? Many of these sites are relying on the (initial) enthusiasm of users or their compelling features to keep and encourage participation. However, more attractive and functional interfaces with clear labels, (usability) tested interfaces, finely crafted workflows and consistent interaction models would both keep early adopters involved and allow for easy bootstrapping for late-comers. When designing participatory, community-oriented sites, designers shouldn’t have to re-invent everything from scratch.

…popular community sites feature common interface elements and functionality:

  • Overall voting and rank status easy to read
  • Dynamically updated interaction
  • Thumbnail, abstract or actual content of item on same page as voting interface
  • Rating information for community at large for the item
  • Suggestions or lists for additional items to rate
  • Textual description of (proposed) item category with link to category
  • Links to related and relevant discussions about item (or item category)
  • Standard interface objects (where appropriate) to leverage existing Web interaction (e.g. purple & blue link colors, tabbed navigation metaphor, drop-down lists)
  • Show history of ratings or queue of items to vote on
  • Aggregate main page or display element that shows overall community ratings (to encourage virtuous competition for most ratings)
  • Task flow for voting or rating clear with additional interactions not required (e.g. following links)

…In addition to dynamic voting status, there is some consideration of simplifying the voting to include “allow” vs. “block” ratings only. Design issues such as the colors of the buttons may also overly influence certain votes.

Basic Voting Interface and Voting History
As part of each user’s own customized portal page, a history of recent votes is prototyped to give users the ability to remember their past votes and see the status of pending items in consideration.
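
As a rough illustration of the “allow” vs. “block” voting and the per-user vote history described above, here is a minimal sketch. It is not the OpenChoice implementation: the `VoteBoard` class, its method names and the sample data are hypothetical, and it ignores real concerns such as duplicate votes and persistence.

```python
from collections import Counter, defaultdict


class VoteBoard:
    """Hypothetical in-memory store for allow/block votes and per-user history."""

    CHOICES = ("allow", "block")

    def __init__(self):
        self.tallies = defaultdict(Counter)  # item -> counts per choice
        self.history = defaultdict(list)     # user -> [(item, choice), ...], oldest first

    def vote(self, user, item, choice):
        """Record one vote; reject anything other than 'allow' or 'block'."""
        if choice not in self.CHOICES:
            raise ValueError(f"choice must be one of {self.CHOICES}")
        self.tallies[item][choice] += 1
        self.history[user].append((item, choice))

    def status(self, item):
        """Return the current overall tally for an item."""
        counts = self.tallies[item]
        return {choice: counts[choice] for choice in self.CHOICES}

    def recent_votes(self, user, limit=5):
        """Return the user's most recent votes, newest first."""
        return self.history[user][-limit:][::-1]


# Hypothetical usage with made-up users and pages.
board = VoteBoard()
board.vote("ann", "example.org/page-a", "allow")
board.vote("bob", "example.org/page-a", "block")
board.vote("ann", "example.org/page-b", "block")

print(board.status("example.org/page-a"))
print(board.recent_votes("ann"))
```

Keeping the overall tally and the personal history in separate structures mirrors the two displays in the design: a community-wide status for each item and a private list of a user's own recent votes.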

Keywords

information interfaces: Graphical User Interfaces, user interfaces, reputation systems, social computing

Cite As

Turnbull, D. (2007). Rating, Voting & Ranking: Designing for Collaboration & Consensus. Paper presented at the Association of Computing Machinery Computer Human Interface Conference (SIGCHI), San Jose, CA.

Publications that cite this publication

  • Galway, D. (2008) Real-life Rating Algorithm [PDF].

Recommended Reading

Building Web Reputation Systems by Randy Farmer and Bryce Glass at Building Web Reputation Systems: The Blog.

Personalized Search

Posted in publications, research, search on April 23rd, 2010 by donturn

Personalized Search: A Contextual Computing Approach May Prove a Breakthrough in Personalized Search Efficiency

[PDF]

James Pitkow, Hinrich Schuetze, Todd A. Cass, Rob Cooley, Don Turnbull, Andy Edmonds, Eytan Adar, et al.

Abstract

A contextual computing approach may prove a breakthrough in personalized search efficiency.

Excerpt

Contextual computing refers to the enhancement of a user’s interactions by understanding the user, the context, and the applications and information being used, typically across a wide set of user goals. Contextual computing is not just about modeling user preferences and behavior or embedding computation everywhere, it’s about actively adapting the computational environment – for each and every user – at each point of computation. (p 50)

The Outride system was designed to be a generalized architecture for the personalization of search across a variety of information ecologies. (p 52)

[Table: average task completion time in seconds, by search engine]

While the results may seem overwhelmingly in favor of Outride, there are some issues to interpret. First, some of the scenarios contained tasks directly supported by the functionality provided by the Outride system, creating an advantage against the other search engines. Indeed, Outride features are specifically designed to understand users, provide support by the conceptual model and tasks users employ to search the Web, and to contextualize the application of search. This is the goal of contextual computing and why personalizing search makes sense.

Second, while the use of default profiles could have provided an advantage for Outride, it also could have negatively influenced the outcome, as the profile did not represent the test participants’ actual surfing patterns, nor were the participants intimately familiar with the content of the profiles. Third, some of the gains are likely due to the user interface since the Outride sidebar remains visible to users across all interactions, helping to preserve context and provide quick access to core search features. For example, while search engines require users to navigate back and forth between the list of search results and specific Web pages, Outride preserves context by keeping the search results open in the sidebar of the Web browser, making the contents of each search result accessible to the user with a single click. Still, the magnitude of the difference between the Outride system and the other engines is compelling, especially given that most search engines are less than 10% better than one another. (p 54)

Keywords

information retrieval, search, information seeking, relevance feedback, personalization, contextual computing, user interfaces, search process

Cite As

Pitkow, J., Schutze, H., Cass, T., Cooley, R., Turnbull, D., Edmonds, A., et al. (2002). Personalized Search: A Contextual Computing Approach May Prove a Breakthrough in Personalized Search Efficiency. Communications of the ACM, 45(9), 50-55.

Advertising Academia With Sponsored Search: an exploratory study examining the effectiveness of Google AdWords at the local and global level

Posted in publications, research on April 22nd, 2010 by donturn

[PDF]

Don Turnbull, Ph.D and Laura F. Bright

Abstract

An exploratory study conducted in late autumn and early winter 2006-2007 investigates the purchasing of sponsored search advertising for a major US university’s academic department. The ad campaign used Google’s AdWords service with the goal of increasing awareness of the academic department and encouraging potential graduate admissions or admissions inquiries. A behavioral model of information seeking is suggested that could be applied for selecting appropriate types of online advertising for awareness and other advertising goals. The study found little overlap between traditional, commerce-oriented online advertising methods and a general awareness campaign, as evidenced by a low click-through rate to the targeted site. Insights for future studies include increased integration with server logs, targeted site query terms and alternative awareness strategies.
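
For readers unfamiliar with the metric, click-through rate is simply clicks divided by impressions (the number of times an ad was shown). The short sketch below shows the arithmetic. The function name and the figures are illustrative assumptions and are not results from the study.

```python
def click_through_rate(clicks, impressions):
    """Return clicks / impressions as a fraction, or 0.0 if the ad was never shown."""
    if impressions == 0:
        return 0.0
    return clicks / impressions


# Hypothetical figures, not results from the study.
rate = click_through_rate(clicks=12, impressions=4000)
print(f"{rate:.2%}")  # prints 0.30%
```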

Keywords

sponsored search; online advertising; search engines; behavioral model; information seeking; electronic business; Google.

Cite As

Turnbull, D. and Bright, L.F. (2008) Advertising academia with sponsored search: an exploratory study examining the effectiveness of Google AdWords at the local and global level, Int. J. Electronic Business, Vol. 6, No. 2, pp.149-171.

Quantitative Information Architecture at the 2010 Information Architecture Summit

Posted in information_architecture, research, science, tech on April 6th, 2010 by donturn

I am presenting on two different topics at the 2010 Information Architecture Summit in Phoenix this week.

The first talk is a set of ideas related to the work I’ve been doing recently, building data structures, crafting algorithms and designing user experiences that are powered by quantitative data.

Quantitative Information Architecture – Don Turnbull, Ph.D.

10:30 – 11:15AM on Saturday, April 10 in Ellis

Why quantitative information architecture? Why now?

You don’t have to be Rain Man or Stephen Hawking to use numbers to get things
done. Quantitative methods are applicable to IA thinking, be it for hypothesis
generation, instrumentation, data collection or analysis of information at
scales never before possible, with insights that are comparable over time,
generalizable and extensible.

Quantitative skills can allow IAs to interpret and analyze others’ designs and
research more readily, as well as combine methods and models for meta-analysis
to help IAs move from description to prediction in designing and developing
future interfaces and architectures.

This presentation will review why you should use quantitative methods and
discuss both foundational and emerging ideas that are applicable for content
analysis, behavioral modeling, social media usage, informetrics and other
IA-related issues.

The Twitter hashtag for this talk is #quantia. Feel free to send me questions directly via twitter/donturn too.

Quantitative Information Architecture slide deck from the 2010 IA Summit

Science 2.0: Globalized Innovation in Electronics talk at UTexas

Posted in austin, research, science, tech on October 17th, 2008 by donturn

Next Tuesday, October 21, 2008, from 5:30 pm to 7:30 pm, in the Brown Room on the 10th floor of the University of Texas LBJ Library, there looks to be an interesting talk:

Strauss Center :: Science 2.0: Globalized Innovation in Electronics by Dan Hutcheson, CEO, VLSI Research

Dan Hutcheson, of VLSI Research, Inc., is a recognized authority and well-known visionary for the semiconductor industry. He advises companies in strategic and tactical marketing, business management and manufacturing trends, productivity and strategy. Mr. Hutcheson developed the industry’s first cost-of-ownership model and the first factory cost-optimization model in the 1980s.

This presentation is part of the Strauss Center’s Technology, Innovation and Global Security Speaker Series, which brings world-renowned experts to campus to discuss how to sustain innovation and better utilize modern technology to benefit an increasingly global economic and social system.

Advertising & Awareness with Sponsored Search: an exploratory study examining the effectiveness of Google AdWords at the local and global level

Posted in research, science, search, tech on October 16th, 2008 by donturn

I will be giving a research talk (added recently, thus not on the conference Web page yet) titled: Advertising & Awareness with Sponsored Search: an exploratory study examining the effectiveness of Google AdWords at the local and global level on October 28 at the American Society of Information Science & Technology (ASIS&T) 2008 Annual Meeting (AM08) in Columbus, Ohio.

This is the abstract for the talk:

This talk reviews an exploratory study of sponsored search advertising for a major US university’s academic department. The ad campaign used Google’s AdWords service with the goal of increasing awareness – not eCommerce – as part of the search process. A behavioral model of information seeking is suggested that could be applied for selecting appropriate types of online advertising for awareness and other advertising goals. Insights into the study methodology will also be discussed, including the use of increased integration with server logs, targeted site query terms and alternative awareness strategies.

The talk is part of the panel AM08 2008 – The Google Online Marketing Challenge: A Multi-disciplinary Global Teaching and Learning Initiative Using Sponsored Search with Bernard Jansen, Mark A. Rosso, Dan Russell, Brian Detlor and Don Turnbull.

This is a summary of the panel:

Sponsored search is an innovative information searching paradigm. This panel will discuss a vehicle to explore this unique medium as an educational opportunity for students and professors. From February to May 2008, Google will run its first ever student competition in sponsored search, The Google Online Marketing Challenge http://www.google.com/onlinechallenge/. Similar to other Google initiatives, the extent seems huge. Based on pre-registrations, more than two hundred professors and nearly nine thousand students from approximately 50 countries will compete. This may be the largest, worldwide educational course ever done. It is certainly on a large scale.

The Google Online Marketing Challenge is a real-life, problem-based, and multidisciplinary educational endeavor of the kind that many educators say is needed to relate teaching to outside the classroom. However, such endeavors are not without risks. The session should appeal to professors that competed in the 2008 Challenge, any professors considering the 2009 Challenge, as well as other educators who might consider the inclusion of Google AdWords as a pedagogical tool in their curricula. The panel will also be of great interest to those information professionals and educators as a possible model for use in other domains besides sponsored search.

Knowledge Management Systems

Posted in research on September 29th, 2008 by donturn

This Fall 2008 semester at the University of Texas, I’m teaching a course on: Knowledge Management Systems

This course surveys Knowledge Management systems that enable the access and coordination of knowledge assets. Technologies reviewed will include intranets, groupware, weblogs, instant messaging, content management systems and email in both individual and organizational contexts. Students will use these KM technologies, review case studies, research methods of knowledge organization and analyze and design KM processes and systems.

The course is chock full of fun topics including:

Of course, we have a class blog too: Knowledge Management Systems @UTexas

Semantic Web Technologies

Posted in research, semantic_web on September 29th, 2008 by donturn

This Fall 2008 semester at the University of Texas, I’m teaching a course on: Semantic Web Technologies

This course approaches understanding Semantic Web technologies from three perspectives:

  • Top-down, theoretical approaches to organizing semantic information including ontologies, taxonomies, knowledge representation and software agents.
  • Bottom-up approaches, sometimes called “emergent semantics” or “the lower case ‘S’ semantic web”, for understanding and creating networked information, including XML-based solutions such as RDF, XPath and RSS. Also included are smaller, informal systems for organizing Web information, including tagging (social bookmarking), microformats and other specific markup and distribution systems.
  • Application approaches focusing on Web Services or “Web 2.0” functionality including distributed (client and server) application design, syndication, Application Programming Interfaces, remote databases and “mash-ups”.

Of course, we have a class blog too: Semantic Web Technologies Blog.

Information Architecture Institute Progress Grants

Posted in information_architecture, research, science, tech on September 19th, 2007 by donturn

I’m pleased to announce (or remind) that the Information Architecture Institute is accepting applications for the Information Architecture 2007 Progress Grants.

The Information Architecture Institute (IAI) will award two USD $1,000 Progress Grants for 2007. The purpose of the program is twofold:

  • to encourage researchers and practitioners to investigate IA-specific issues
  • to publicize useful work that furthers the information architecture body of knowledge

Applications should propose work that will forward the theory and practice of information architecture. This can include original research, a synthesis of important existing research, or the development of an innovative new technique.

The IAI Progress Grant Committee will review the proposals and select those with the highest potential to benefit the information architecture field. Half of the grant amount will be awarded when the grant recipients are announced and half when the work is completed. Progress grants will only be awarded to proposals of sufficient quality, clarity, and originality.

Work supported through this program will be published on the iainstitute.org website, but it should have relevance beyond the Tools and Library collections. For instance, the work could inform future IAI workshop curricula, tie in with potential Institute publishing projects, be responsive to issues raised by members in the email discussion list, or support other Institute activities, such as Local Groups and International initiatives.

The application deadline is October 15, 2007.

Applications should be 2,000 words or fewer and must contain:

  • Description of the problem or hypothesis
  • Methodology to be used
  • Explanation of how the resulting work will forward the theory or practice of IA
  • Conditions under which others can use the results (e.g. Creative Commons license)

(Note that I’m on the Awards Jury Committee for this grant.)

Learn more about the Information Architecture 2007 Progress Grants now.