Accessibility has been defined by the World Wide Web Consortium's (W3C) Web Accessibility Initiative (WAI) as the need to "create Web content that is perceivable, operable, and understandable by the broadest range of users, and robust enough to work with current and future technologies" [11]. The WAI acts as the central point for setting accessibility guidelines for the Web. However, applying the guidelines is often insufficient to guarantee accessibility. Recent research conducted on 1,000 U.K. Web sites on behalf of the U.K.'s Disability Rights Commission [6] demonstrates that most of these sites are quite difficult to access, and that 45% of the discovered problems cannot be considered violations of WAI guidelines; they are instead attributed to unclear and confusing organization of pages and to confusing and disorienting navigation mechanisms.
Indeed, the WAI notion of accessibility is fundamentally focused on properties of the page mark-up code that make page contents readable by technologies for assisting disabled users. Consequently, WAI guidelines promote presentation accessibility. However, Web applications should primarily support users in identifying, retrieving, and navigating contents [7, 10]. This observation is even more relevant for the development of data-intensive Web applications, such as digital libraries, electronic catalogs, or institutional sites, whose main focus is providing access to a large quantity of data and services.
We define content accessibility as the property of a Web site of delivering well-organized information. Content accessibility requires a clear identification of a few core contents that synthetically convey the information of the entire application, and then the repeated use of a few well-designed access patterns, so as to give users the impression of mastering the process of retrieval and navigation. Content accessibility is essential: when it is lacking, even the most effective use of presentation facilitations does not make a site accessible. In this article, we discuss methods and heuristics to achieve content accessibility, which depend on a few fundamental principles: being driven by high-level models and approaching the problems in a top-down fashion. In addition, this article presents Web marts, a new abstraction that, similarly to data marts in data warehousing, helps shape the information delivered by a data-intensive Web application and the hypertext interface enabling access to that information.
Modeling content is perhaps the most important aspect of data-intensive Web applications. When content is used for specific purposes, it is possible to recognize special patterns that facilitate the definition and organization of information. For example, data warehousing experts have invented the notion of the data mart as a particular conceptual schema having one entity, describing facts, surrounded by multiple entities, describing the dimensions of data analysis. For this particular topology, data marts are also known as star schemas. Users employ dimensions for selecting facts, and suitable tools for computing their aggregate properties; a fixed topology enables the definition of specific data management operations, such as slicing and dicing (progressively adding and dropping dimensions for a given data analysis) or pivoting (turning the data mart along its dimensions).
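As a rough illustration of how dimensions select facts and how aggregates are computed over them, the following Python sketch models a tiny star schema in memory. The sales facts, the dimension names (region, product, year), and the aggregate helper are all hypothetical, invented for this example; an actual data mart would rely on a DBMS or an OLAP tool.

```python
from collections import defaultdict

# Hypothetical fact rows: each fact carries one value per dimension
# (region, product, year) plus a numeric measure (amount).
FACTS = [
    {"region": "North", "product": "A", "year": 2005, "amount": 100},
    {"region": "North", "product": "B", "year": 2005, "amount": 50},
    {"region": "South", "product": "A", "year": 2005, "amount": 70},
    {"region": "South", "product": "A", "year": 2006, "amount": 30},
]


def aggregate(facts, dimensions, **selection):
    """Sum the measure, grouped by `dimensions`, over facts matching `selection`."""
    totals = defaultdict(int)
    for fact in facts:
        # Dimensions are used to select the facts of interest.
        if all(fact[dim] == value for dim, value in selection.items()):
            key = tuple(fact[dim] for dim in dimensions)
            totals[key] += fact["amount"]
    return dict(totals)


# Start the analysis along a single dimension...
by_region = aggregate(FACTS, ["region"])
# ...then add a second dimension to the same analysis.
by_region_product = aggregate(FACTS, ["region", "product"])
# Select facts through one dimension and analyze them along another.
product_a_by_year = aggregate(FACTS, ["year"], product="A")
```

Because every fact has the same fixed set of dimensions, a single generic helper is enough to add or drop dimensions; this is the property of the fixed topology that the text refers to.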
In analogy with data marts, we have recognized a repeating pattern within data-intensive Web applications that can also facilitate the definition and organization of information; we call it Web mart. Web marts have a central entity, the core concept, and several surrounding entities: access entities, enabling selection through navigation of core concepts, and detail entities, describing core concepts in greater detail.
Figure 1 illustrates the difference between a data mart and a Web mart. The two patterns are represented using the Entity-Relationship (E-R) notation [3], in which boxes denote classes of information objects, called entities, and lines denote relationships among entity instances. While in data marts all dimensions have an equivalent role, in a Web mart access entities and detail entities have distinct roles. Moreover, while in star schemas all the information about access dimensions is factored into a single entity, Web marts have access dimensions organized hierarchically, typically with two or three levels, and the detail entities may show interconnections, sometimes evolving into a structured subschema.
Single Web marts need further concepts for making the navigation among them possible. Such concepts globally constitute the interconnection schema, which serves the purpose of enabling the navigation between core concepts. With such additional ingredients, the E-R schema of a data-intensive Web application can be described by highlighting its (few) core concepts, each with an access subschema (including access entities) and a core subschema (including detail entities), and the interconnection schema linking core concepts.
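The structure just described can be sketched as a small data structure. The Python fragment below is only an illustration under stated assumptions: the class names WebMart and ApplicationSchema and their methods are hypothetical, and the entity names are loosely borrowed from the department example discussed later in the article (Section, Project, and PublishedMaterial are invented labels).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class WebMart:
    """One core concept with its access subschema and its core subschema."""
    core: str
    access: List[str] = field(default_factory=list)   # access entities, top level first
    details: List[str] = field(default_factory=list)  # detail entities

    def role_of(self, entity: str) -> Optional[str]:
        """Return the role an entity plays in this Web mart, if any."""
        if entity == self.core:
            return "core"
        if entity in self.access:
            return "access"
        if entity in self.details:
            return "detail"
        return None


@dataclass
class ApplicationSchema:
    """A set of Web marts plus the interconnection schema linking core concepts."""
    marts: List[WebMart]
    interconnections: List[Tuple[str, str]] = field(default_factory=list)

    def connected_cores(self, core: str) -> List[str]:
        """Core concepts reachable from `core` through the interconnection schema."""
        linked = []
        for left, right in self.interconnections:
            if left == core:
                linked.append(right)
            elif right == core:
                linked.append(left)
        return linked


member = WebMart("DEIMember", access=["MemberCategory"],
                 details=["PersonalPage", "Course", "Publication"])
area = WebMart("ResearchArea", access=["Section"],
               details=["Project", "PublishedMaterial"])
schema = ApplicationSchema([member, area], [("DEIMember", "ResearchArea")])
```

Keeping access and detail entities in separate lists mirrors the distinction between the two roles, which a plain star schema would not record.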
We have observed many conceptual models of data-intensive Web applications, and discovered Web marts within the most intricate E-R graphs. Their occurrence as recurrent patterns within Web applications is also recognized by the findings on Web self-similarity [4], describing the Web as a repeating structure whose levels include centers of gravity (representing the core concepts) with highway links interconnecting them (representing the interconnection schema) and links incoming to the centers of gravity (connecting access entities to them) or outgoing from the centers of gravity (connecting core concepts to detail concepts). These empirical findings match well with the assumption of Web marts as the underlying ingredient of Web applications. Web marts are the building blocks of content-accessible data-intensive Web applications. As illustrated here, our method offers a number of practical recommendations regarding how to use them for achieving content accessibility.
Our method assumes the adoption of a model-driven design of Web applications. Specifically, we expect that Web designers use both a data and a hypertext model. The former is a classical data model, for example, the E-R model or UML class diagrams. A hypertext conceptual model features the following properties:
Here, we use the WebML conceptual model [2], and describe some design activities to highlight the use of Web marts. However, several conceptual models for Web application design (see [5] for a survey) satisfy the same requirements.
Data design. The method we propose is data-driven, meaning the whole development method is based on data design. In order to enhance content accessibility, data design must address Web mart discovery and definition, according to the following activities.
Identify the core concepts. Every data-intensive Web application has a few core concepts. Consider company Web sites, whose core concepts include: company profile, products, success stories, vendors, personnel, community. Core concepts should be data-intensive, that is, have several instances; those concepts having one or very few instances are best described by textual files and should not be regarded as core concepts. For instance, the company profile is not data-intensive, and neither are the company founders, if they are limited to few well-defined individuals.
For every core concept, build a Web mart. Data-intensive core concepts are then modeled as entities with a rich collection of attributes, after which the focus moves to determining access and detail entities. Domain attributes are fundamental sources of inspiration to define access entities. Consider the city location of a vendor in a company Web site: it becomes an obvious access entity. Further grouping of cities into regions yields a classical access hierarchy, region/city/vendor. Well-established criteria for access design [8] can be adopted for defining the access path depth and the density of instances to be shown at each node of the access chain.
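To make the region/city/vendor hierarchy concrete, the following Python sketch derives it from the domain attributes of a handful of vendor records and reports how many entries a user would see at each node of the access chain. The vendor names, the record layout, and both helper functions are hypothetical; the density figures are the kind of input that access design criteria would then evaluate.

```python
from collections import defaultdict

# Hypothetical vendor records; city and region are the domain attributes
# that suggest the access entities.
VENDORS = [
    {"name": "Vendor One", "city": "Milan", "region": "Lombardy"},
    {"name": "Vendor Two", "city": "Milan", "region": "Lombardy"},
    {"name": "Vendor Three", "city": "Como", "region": "Lombardy"},
    {"name": "Vendor Four", "city": "Turin", "region": "Piedmont"},
]


def build_access_hierarchy(vendors):
    """Group vendors by region, then by city (the region/city/vendor chain)."""
    hierarchy = defaultdict(lambda: defaultdict(list))
    for vendor in vendors:
        hierarchy[vendor["region"]][vendor["city"]].append(vendor["name"])
    # Convert to plain dicts so lookups of missing keys fail loudly.
    return {region: dict(cities) for region, cities in hierarchy.items()}


def node_density(hierarchy):
    """Number of entries shown to the user at each node of the access chain."""
    density = {"(regions)": len(hierarchy)}
    for region, cities in hierarchy.items():
        density[region] = len(cities)
        for city, names in cities.items():
            density[f"{region}/{city}"] = len(names)
    return density


hierarchy = build_access_hierarchy(VENDORS)
density = node_density(hierarchy)
```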
Finally, core concepts need to be further described, yielding detail entities. For instance, each product is associated with a list of technical features, which needs to be modeled as a separate entity, as the relationship between products and technical specifications is one-to-many (every product has multiple details depending on several options).
Build the interconnection schema. Once Web marts are completed, the designer should think of possible interconnections between them, and build the interconnection schema. Interconnections may relate different instances of the same core concept, or instances of different core concepts. As a well-known example, consider Amazon.com; assume that Book and Author are core concepts. Then, the link from a book to its authors is a classical relationship belonging to the interconnection schema that can be used bi-directionally to present all authors of a book and then all the books of a given author.
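A bi-directional interconnection of this kind can be sketched as a pair of lookup tables built from the same relationship instances. The Interconnection class, its method names, and the book and author labels below are hypothetical placeholders, not data or code from any real catalog.

```python
from collections import defaultdict

# Hypothetical (book, author) pairs: the instances of the relationship
# between the two core concepts.
WRITTEN_BY = [
    ("Book X", "Author 1"),
    ("Book X", "Author 2"),
    ("Book Y", "Author 1"),
]


class Interconnection:
    """A relationship between two core concepts, navigable in both directions."""

    def __init__(self, pairs):
        self._forward = defaultdict(list)
        self._backward = defaultdict(list)
        for source, target in pairs:
            self._forward[source].append(target)
            self._backward[target].append(source)

    def targets_of(self, source):
        """Navigate from a source instance (for example, a book to its authors)."""
        return list(self._forward.get(source, []))

    def sources_of(self, target):
        """Navigate back (for example, from an author to all of the author's books)."""
        return list(self._backward.get(target, []))


book_author = Interconnection(WRITTEN_BY)
authors = book_author.targets_of("Book X")
other_books = book_author.sources_of("Author 1")
```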
Coarse hypertext design refers to design activities going up to the page level, pertaining to architecting the Web site structure. As mentioned previously, a Web application may feature different hypertexts (site views), each one providing a different view over the same data, addressing different users or roles. Given a site view, the following steps are required in order to enhance content accessibility.
For each Web mart, define an area of the site view, and then identify the main pages of each area. This is a classical top-down design; while area definition is almost immediate, page identification requires determining which pages are placed within each area, but this design decision is subject to reconsideration throughout the process. Normally, within each area one core page should be dedicated to each core concept; then other auxiliary pages may be required for accessing detail entities. The area core page is designated as the top page, and it is the first page being accessed when entering the area. Finally, note that complex Web applications may also require hierarchies of areas.
Identify landmarks. By landmarks, we denote pages that must be visible and reachable throughout the site, typically by means of persistent navigation bars. Generally, the landmark property is associated with the top pages of areas. Since areas are associated with core concepts, landmarks facilitate access to core concepts. Additionally, some notable pages, such as the home page, login, or search pages, are also defined as landmarks. Landmark definition can result in a hierarchy of navigation bars: for example, higher-level global navigation bars provide access to the site areas; then, within each area, a local navigation bar may provide access to some notable pages of that area.
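The two-level hierarchy of navigation bars can be derived mechanically once areas, top pages, and landmarks are declared. The sketch below assumes a hypothetical site view with two areas; the page names, the dictionary layout, and the helper functions are invented for illustration.

```python
# Hypothetical site view: each area has a top page, a list of pages, and
# the pages flagged as landmarks local to the area.
AREAS = {
    "Research": {
        "top": "Research home",
        "pages": ["Research home", "Projects", "Labs"],
        "local_landmarks": ["Projects"],
    },
    "People": {
        "top": "People home",
        "pages": ["People home", "Member page"],
        "local_landmarks": [],
    },
}
# Notable pages that are landmarks regardless of any area.
SITE_LANDMARKS = ["Home", "Search"]


def global_bar(areas, site_landmarks):
    """Global bar: the site-wide landmarks plus the top page of every area."""
    return list(site_landmarks) + [area["top"] for area in areas.values()]


def bars_for(page, areas, site_landmarks):
    """Navigation bars to show on `page`: the global bar everywhere,
    and the local bar of the area that contains the page."""
    for name, area in areas.items():
        if page in area["pages"]:
            return {
                "area": name,
                "global": global_bar(areas, site_landmarks),
                "local": list(area["local_landmarks"]),
            }
    # Pages outside any area (such as the home page) get the global bar only.
    return {"area": None, "global": global_bar(areas, site_landmarks), "local": []}


labs_bars = bars_for("Labs", AREAS, SITE_LANDMARKS)
home_bars = bars_for("Home", AREAS, SITE_LANDMARKS)
```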
Detailed hypertext design consists of determining how pages display information by means of content units or enable the launch of operations such as buy, register, or download, each one started by clicking on a link, possibly after providing input via a form. Also, each page must include navigation mechanisms for accessing and browsing contents.
In order to enhance content accessibility, indexes supporting navigational access and input enabling direct search must be carefully included within pages. These two mechanisms are complementary; both consist of building access pages, whose primary purpose is publishing the contents of access entities, to locate core pages. Navigation occurs along the access entities, by selection of specific index entries, such as regions and cities of a vendor. Search occurs by entering given parameters within given forms associated with queries, for example, searching for vendors available for a given product. The result is typically a ranked list of core objects. Once core pages are reached, navigation-based mechanisms are still needed for browsing the detail entities, and for moving among different core concepts.
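The two complementary access mechanisms can be contrasted in a few lines of Python. Everything in this sketch is an assumption made for illustration: the vendor records, the navigate and search helpers, and in particular the ranking rule (number of requested products a vendor offers), which stands in for whatever ranking a real application would use.

```python
# Hypothetical vendor records with the products each vendor offers.
VENDORS = [
    {"name": "Vendor One", "region": "Lombardy", "city": "Milan",
     "products": {"router", "switch"}},
    {"name": "Vendor Two", "region": "Lombardy", "city": "Como",
     "products": {"router"}},
    {"name": "Vendor Three", "region": "Piedmont", "city": "Turin",
     "products": {"printer"}},
]


def navigate(vendors, region, city):
    """Navigational access: follow index entries along the access entities."""
    return [v["name"] for v in vendors
            if v["region"] == region and v["city"] == city]


def search(vendors, wanted_products):
    """Direct search: return a ranked list of core objects.

    Ranking rule (an assumption): vendors offering more of the requested
    products come first; ties are broken alphabetically by name.
    """
    wanted = set(wanted_products)
    hits = []
    for vendor in vendors:
        score = len(wanted & vendor["products"])
        if score > 0:
            hits.append((score, vendor["name"]))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [name for _, name in hits]


by_navigation = navigate(VENDORS, "Lombardy", "Milan")
by_search = search(VENDORS, ["router", "switch"])
```

Both functions return core objects (vendors), so the core page reached is the same whichever mechanism the user prefers.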
Presentation design aims at defining how contents and navigation controls must be placed and presented within pages. This is the stage where content accessibility becomes less important, and presentation accessibility is most significant.
Specify the page layout. This step consists of defining the page grid (a table containing a specific arrangement of rows, columns, and cells), representing the layout in which contents and navigation controls must be organized. Pages displaying the same type of content should comply with the same page layout. However, several page grids can be defined for pages of different types, including core, access, and auxiliary pages.
Identify element positioning within pages. Rules must be established for assigning page elements, such as contents, navigation bars, login, and entry forms, to selected positions in the page grid. The aim is to obtain a consistent positioning across different pages of elements with similar semantics, with the effect of increasing orientation and reducing the users' cognitive overhead while identifying the meaning of contents and navigation controls.
Define the graphical style. Formatting rules must be defined for page graphic elements (for example, fonts and colors), which apply to such recurrent page elements as text, headings, anchors, tables, and so on. The definition of such rules also requires considering guidelines for presentation accessibility, such as those prescribed by W3C-WCAG [11], for ensuring accessibility by different users, devices and user agents. In order to increase consistency across the whole site, formatting rules can be expressed by means of style sheets.
Here, we discuss content accessibility of the Web application of the Department of Electronics and Information (DEI) at Politecnico di Milano (www.elet.polimi.it), using the concept of Web marts for reconstructing its general organization. The application is a large institutional site consisting of a public part with approximately 9,000 page requests per day from external users, and an intranet area, supporting administrative procedures.
Figure 2 illustrates a fragment of the DEI application content, specified according to the E-R notation. The diagram highlights DEIMember as a core concept, with an access entity, MemberCategory, based on the partitioning of the department members into researchers, associate professors, and full professors. Detail entities include information about personal pages, courses, and publications.
ResearchArea is another core concept of the application. Research areas can be accessed through a classification based on the department sections. Their content is enriched by means of published materials (such as publications, links to project Web sites, and so forth), and descriptions of research projects. The relationship between DEIMember and ResearchArea gives an example of interconnection between core concepts.
Other core concepts, not represented in the figure in the interest of brevity, are concerned with teaching activities and industrial relationships.
At the level of coarse design, the hypertext of the public part of the site is organized in four areas, publishing contents of the four core concepts that characterize it. Each application page includes a global navigation bar in its top region, whose links enable the access to the four application areas (Research, Teaching, Industry, and People). Also, pages in a given area include a local navigation bar with links to some relevant area pages.
Figure 3 (a) shows the layout of a DEI member's page. Besides the two navigational bars (global and local) for landmark navigation, the page shows information about the core concept DEI Member (clustered in the central area). The DEI member details are represented by one index pointing to personal pages (with two entries), and by some links leading to publications and course materials. A one-entry index, pointing to research areas, represents the interconnection with the Research Area core concept.
The model of the DEI member page is represented in Figure 3(b). It includes a data unit, two links pointing to auxiliary pages, two index units, and one link enabling the interconnection with the DEI member research areas. Landmarks are properties of areas and pages reachable through links in the navigation bars; therefore landmarks are not shown within the design of this page. Figure 3(b) is thus the "abstract model," and Figure 3(a) is one of its possible "renderings," obtained by superimposing a presentation style on it.
The page model maintains a clear correspondence with the Web mart: the page concentrates on publishing the core contents and some of the details of a DEI Member, while providing navigation mechanisms for moving toward other detail concepts. Indeed, from a Web mart it is possible to derive "patterns" of Web pages whose primary function is to publish its components. A correspondence with Web marts can be exploited by navigation aids, such as Web readers; for every content unit in a page, it is possible to say what is the underlying data content being displayed and what is the role of the unit in the Web mart.
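The kind of annotation a navigation aid could exploit can be sketched as follows. This is a simplified, hypothetical reading of the member page: the unit list, the dictionary describing the Web mart, and the helper functions are assumptions for illustration and do not reproduce the actual WebML model of Figure 3(b).

```python
# Hypothetical description of the DEIMember Web mart.
DEI_MEMBER_MART = {
    "core": "DEIMember",
    "access": ["MemberCategory"],
    "details": ["PersonalPage", "Course", "Publication"],
    "interconnected": ["ResearchArea"],
}

# Simplified page model: (kind of unit or link, underlying entity).
MEMBER_PAGE = [
    ("data unit", "DEIMember"),
    ("index unit", "PersonalPage"),
    ("link", "Publication"),
    ("link", "Course"),
    ("index unit", "ResearchArea"),
]


def role_in_mart(entity, mart):
    """Role played in the Web mart by the entity underlying a page element."""
    if entity == mart["core"]:
        return "core concept"
    if entity in mart["access"]:
        return "access entity"
    if entity in mart["details"]:
        return "detail entity"
    if entity in mart["interconnected"]:
        return "interconnected core concept"
    return "unknown"


def describe_page(page, mart):
    """One line per page element, as a Web reader might announce it."""
    return [f"{kind} over {entity}: {role_in_mart(entity, mart)}"
            for kind, entity in page]


announcements = describe_page(MEMBER_PAGE, DEI_MEMBER_MART)
```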
The concept of Web mart can also be used for a critical analysis of content accessibility. For example, consider the visibility of publications within the DEI Web site. At the data level, the entity Publication has been conceived as a detail of the DEIMember concept, not as a core concept (see Figure 2). At the hypertext level, publications are therefore reachable only from the page of each DEI member (see Figure 3(a)). This design choice prevents users from accessing DEI publications easily, since their discovery is conditional upon access to a DEI member's page. In the specific example, given the relevance of access to DEI publications, this can be considered a lack of content accessibility.
To overcome this limitation, a direct search unit over all publications was added in the publication page, but this was not really effective. Through a Web usage analysis performed over 15 days of Web logs, we noted that only two out of 440 accesses to publications were through this mechanism. Based on this feedback, the DEI Web application is currently under revision. In general, the omission of a Web mart, and specifically of navigational access to its core concepts, cannot be compensated by the injection of ad-hoc direct access mechanisms.
Recent works on accessibility have especially focused on impaired users and have been mainly centered on presentation, independently of other factors, such as the quality and usability of Web applications [6]. Fortunately, a considerable part of the Web Engineering and HCI research community considers accessibility a universal concept, which can be beneficial for any user (and not only for impaired ones). Accessibility can in fact be regarded as a general concept, which must accommodate technology variety and user diversity, especially focusing on the ease of content retrieval and access [7, 9]. Under these premises, and in light of some previous works on usability and accessibility, we explain why content accessibility is so relevant.
According to Nielsen [8] and Shneiderman [9], an intuitive application should not require prior experience with other applications of the same kind. Users should be able to apply previous mental models to interact with any new application. The discovery of Web self-similarity [4] is therefore the best argument supporting our claim that Web marts are indeed the simplest model of interaction, the one that requires the least prior experience from its users. Also, the notion of core object as a central entity with a rich set of properties matches the classical structure of concepts defined by cognitive psychology as a way for human beings to articulate knowledge [1]. While using a known pattern for content description is quite useful, users should not experience dissonance between their expectations and the content retrieved; otherwise, they would not be able to reformulate their goals and start a new search. Thus, it is essential that search methods return exactly the core objects that are being searched, in a format conformant to expectations.
A well-established design practice is to adopt hierarchical information structures and index navigation aids to support people in top-down search strategies [8, 10]. However, key-based strategies are useful when top-down strategies fail. This argument supports the building of both navigation and search access to pages, with emphasis on ease of navigation; both must be carefully designed as part of the Web mart.
Human cognitive resources are intrinsically limited. To avoid cognitive overloads that may interfere with the users' process of reaching their goals, information must be carefully designed, so as to maximize its internal consistency at the content level, which is certainly as important as presentation. The notion of a Web mart is quite simple and intuitive, but it can be very effective in the design of data-intensive Web applications. It can be supported by repeatable design patterns and induces several good design practices.
1. Bruner, J., Goodnow, J.J., and Austin, G.A. A Study of Thinking. Transaction Publications Inc., 1995.
2. Ceri, S. et al. Designing Data-Intensive Web Applications. Morgan Kaufmann, 2003.
3. Chen, P.P. The Entity-Relationship model: Toward a unified view of data. ACM Transactions on Database Systems (TODS) 1, 1 (Jan. 1976), 9-36.
4. Dill, S. et al. Self-similarity in the Web. In Proceedings of VLDB 2001 (Rome, Italy, Sept. 2001), 69-78.
5. Fraternali, P. Tools and approaches for developing data-intensive Web applications: A survey. ACM Computing Surveys 31, 3 (Mar. 1999), 227-263.
6. Hudson, W. Inclusive design: Accessibility guidelines only part of the picture. Interactions 11, 4 (July 2004), 55-56.
7. Hull, L. Accessibility: It's not just for disabilities any more. Interactions 11, 2 (Mar. 2004), 36-41.
8. Nielsen, J. Web Usability. New Riders, 2000.
9. Shneiderman, B. Universal usability. Commun. ACM 43, 5 (May 2000), 84-91.
10. Shneiderman, B., Byrd, D., and Croft, W.B. Sorting out searching. Commun. ACM 41, 4 (Apr. 1998), 95-98.
11. Web Content Accessibility Guidelines 2.0. W3C-WAI Working Draft, March 2004.
Figure 1. Data marts and Web marts, represented according to the E-R notation. Boxes denote entities (classes of information objects); lines denote relationships (classes of associations among information objects). The two values specified over a relationship at the side of an entity represent constraints on the minimum and maximum number of entity objects that can participate in the relationship.
Figure 2. A fragment of the DEI E-R data schema.
Figure 3. A rendering of the DEI member page in the DEI Web site (a), and its abstract conceptual model (b).
©2007 ACM 0001-0782/07/0400 $5.00