Early blog features
Anne Helmond, Vera Bekema, Bram Nijhof, Niels Kerssens, Simon Marschall, Elena Tiis, Tjerk Timan.
- How do we define the early blogosphere?
- What was unique about these early blogs?
- Did these early blogs enable common features and if so, which ones were developed and which ones neglected?
- Which features have been taken up by corporate blog hosting sites and have become standards?
- Which features have not been standardized?
One way to map out the early blogosphere is to approach it through the Eatonweb blog directory. Eatonweb could be considered to mark the end of the early blogosphere, as Brigitte Eaton was no longer able to keep up with the growing number of blogs. The introduction of the Blogger platform, which popularized the act of blogging, may also be seen as a marking point of the early blogosphere. To study the early blogosphere we use the Internet Archive, which seems to break the early blogosphere archive rather than preserve it.
- list of platforms (most important ones)
- acronyms for the features
- if we find features: when are they implemented?
- how do we tell the story of the features, why did some change or why did some features die?
Problem: archive.org is incomplete (missing files, robots.txt exclusions, etc.)
By using the Wayback Machine on archive.org, an archived entry of the Eatonweb directory could be retrieved (blogosphere as mapped by Brigitte Eaton in mid-2000). This entry is considered one of the last possible attempts to manually list all existing blogs on one page. On the basis of this archived entry, another group of the DMI summer school '09 created a spreadsheet containing the URLs of all blogs mentioned (see attachment Blogosphere_as_listed_by_Eatonweb_2000.xls).
As a second step, we took the first 20 results and queried versions of these blogs through the archive.org Wayback Machine. Looking at all site updates from the earliest entry up until 15 August 2001, we manually tried to frame the features of these often highly individual pages. This proved particularly difficult, as many of the elements from which people composed their sites would only later become features with stable names and attributes. Reviewing our notes, we tried to turn them into a list of terms that could be queried. As a compromise between recognizing the high level of individualization of the early blogosphere (i.e. individualized terminology for common features) and the interest in making changes over time visible, our final list contains many items which are still present nowadays, but also specific elements which we recognized as being more common in this early stage of the blogosphere (see attachment Blog_features__platforms_over_time_1999-2001.xls).
We did this by separating 'feature name as given by user' from 'feature description' wherever a feature matched a (now) common blog feature.
Further, we also created a list referring to the platforms on which early blogs were hosted (see attachment Blog_platforms_categorized.xls ).
However, this list was not based on the 20 blogs mentioned above but was compiled on the basis of the results retrieved by the Wayback Tool (see below).
After creating these lists of features and platforms, we used the DMI Wayback Tool to query these terms within specific date ranges. By creating an 'archive of the archive', the Wayback Tool allows the querying of specific terms within the range of its collection. Three main collections were set up, each referring to the site update closest to the middle of the years 1999, 2000 and 2001. Each annual collection was then queried with both the feature terms and the platform names. However, the fact that a term is mentioned on a site does not mean that the site is using the feature: the word can refer to something else or have another meaning. Thus, editorial work had to be done to create a valuable output.
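The selection of one site update per year can be sketched as follows. This is a minimal illustration, not the Wayback Tool's actual implementation: it assumes that "the middle of the year" means 1 July, that only snapshots from within the year itself are considered, and that snapshots are given as the 14-digit timestamps (YYYYMMDDhhmmss) used by the Wayback Machine. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime

def closest_to_midyear(timestamps, year):
    """Return the snapshot timestamp closest to 1 July of `year`.

    `timestamps` are 14-digit Wayback-style strings (YYYYMMDDhhmmss).
    Assumption: only snapshots taken in `year` qualify; None if there are none.
    """
    midyear = datetime(year, 7, 1)
    in_year = [t for t in timestamps if t.startswith(str(year))]
    if not in_year:
        return None
    return min(
        in_year,
        key=lambda t: abs(datetime.strptime(t, "%Y%m%d%H%M%S") - midyear),
    )

# Hypothetical snapshot list for a single blog.
snapshots = ["19991012093000", "20000301120000", "20000815080000", "20010620000000"]

# One snapshot per annual collection.
collection = {year: closest_to_midyear(snapshots, year) for year in (1999, 2000, 2001)}
```

Term matches found in such a collection would still need the manual review described above before being counted as actual uses of a feature.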
On the basis of the number of features and platforms found in relation to the total number of weblogs per year, we calculated percentages to better show developments over time.
Total number of pages that were searched in the Eatonweb collection for each year: 190 (1999), 761 (2000) and 769 (2001).
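Percentages: 100 / X_tbpy * X_bf = % for features and 100 / X_tbpy * X_bp = % for platforms, with X_tbpy read as the total number of blogs per year, X_bf as the number of blogs showing a given feature and X_bp as the number of blogs on a given platform.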
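As a worked illustration of the percentage calculation, the sketch below applies the formula to the yearly page totals given above. The hit counts are hypothetical and serve only to show the arithmetic; they are not results from the study. The function name is likewise an assumption.

```python
# Pages searched per annual collection (figures from the text).
PAGES_SEARCHED = {1999: 190, 2000: 761, 2001: 769}

def share_percent(hits, year):
    """Percentage of that year's pages on which a feature or platform was confirmed.

    Equivalent to 100 / total * hits, written as 100 * hits / total.
    """
    return 100 * hits / PAGES_SEARCHED[year]

# Hypothetical confirmed hit counts for a single feature.
hypothetical_hits = {1999: 19, 2000: 152, 2001: 231}

for year, hits in hypothetical_hits.items():
    print(year, round(share_percent(hits, year), 1))
```

Because the yearly totals differ considerably (190 pages versus more than 760), comparing raw counts across years would be misleading, which is why the shares are normalized per year.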
The results were then visualized by arranging them horizontally by function (i.e. category) or by blogging service, respectively. Elements are ordered vertically over time. The nucleus of each feature is sized proportionally and in relation to the other data; it displays the feature's presence in the blogosphere in percent and thus shows the development of a feature over time. Each cell's skin size corresponds to a 100% saturation of the blogosphere with a particular feature, and the cell is colored on the basis of its categorization.
(a) Development of features of the early blogosphere over time (1999/2000/2001)
(b) Platforms of the early blogosphere (to be completed)
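The relation between nucleus and skin can be sketched as a small scaling function. The text does not state whether the nucleus was scaled by area or by diameter, so both options are offered here as assumptions; the function name and the pixel values are hypothetical.

```python
import math

def nucleus_diameter(percent, skin_diameter, by_area=True):
    """Diameter of a feature's nucleus inside a cell whose skin stands for 100%.

    by_area=True  -> the nucleus AREA is proportional to the percentage (assumption).
    by_area=False -> the nucleus DIAMETER is proportional to the percentage.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    fraction = percent / 100
    scale = math.sqrt(fraction) if by_area else fraction
    return skin_diameter * scale

# Hypothetical cell with an 80-pixel skin and a feature present on 25% of blogs.
area_scaled = nucleus_diameter(25, 80)
linear_scaled = nucleus_diameter(25, 80, by_area=False)
```

With area scaling a 25% feature fills a quarter of the cell's surface, whereas with diameter scaling it looks considerably smaller, so the choice affects how growth over time is perceived.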