
XForms and P3P

Help users manage their privacy preferences

Nicholas Chase (ibmquestions@nicholaschase.com), Freelance writer, Backstop Media
Nicholas Chase has been involved in Web site development for companies such as Lucent Technologies, Sun Microsystems, Oracle, and the Tampa Bay Buccaneers. Nick has been a high school physics teacher, a low-level radioactive waste facility manager, an online science fiction magazine editor, a multimedia engineer, an Oracle instructor, and the Chief Technology Officer of an interactive communications company. He is the author of several books, including XML Primer Plus (Sams).

Summary:  Because of the rise of identity theft, online privacy has become a big issue. Many sites have privacy policies in place, but who has time to read and decipher each one as you do your daily surfing? Fortunately, there is an easier way. The Platform for Privacy Preferences, or P3P, provides a standard way for sites to define the information they collect, which makes it possible for tools to do the deciphering for you. Because XForms is so often used to collect personal information, it is crucial that it be included in this process. This article explains how the Platform for Privacy Preferences works, and how to integrate your XForms with it.

Date:  14 Nov 2006
Level:  Intermediate

Is privacy dead?

The subject of privacy will be debated for years to come. Some people maintain that there is no longer any such thing as privacy and that you should just accept that, but efforts are still being made to maintain some form of privacy, especially outside of the United States. In Europe, for instance, rules about what information can be collected and retained are much stricter than they are in the US, and the growing globalization of commerce means that nobody can ignore this trend.

Even in the United States, companies are required to provide notice of their privacy policies to customers when collecting data under various conditions. You've probably received dozens of them in the mail, from your bank, from your insurance company, and so on. You may even have read some of them.

You are besieged by privacy statements online, as well, to the point where many people simply don't read them. Perhaps they assume that because the company has a privacy policy, then as customers they are entitled to some form of privacy. In fact, all the privacy policy does is define what information the company collects, why it is collected, and what can be done with it. A privacy policy may state that a site collects information about everything you do on their site, plus your name, address, phone number, birthday, and the name of your childhood imaginary friend, all of which will be retained forever and posted on the public Internet for anyone to see. That's perfectly legal; they just have to tell you they're going to do it.

So it would certainly be more convenient to have the ability to tell your browser to look for certain characteristics of a privacy policy and notify you in situations in which the privacy policy is not acceptable, or is at least questionable. For example, you may be fine with the idea of giving out your phone number to a company that will use it only for the current transaction and will retain it for less than two weeks, but if the company is going to release that information to a third party, you may want to know that before you give your number in the first place. Perhaps it is required by the shipping company that will be delivering your order, in which case you don't have a problem supplying it, but you want to be notified first.

All of this is made possible by the standardization provided by P3P.


An example

Unfortunately, there are few really complete P3P implementations on the market. P3P will eventually be included in Mozilla browsers such as Firefox, but for now you are pretty much limited to downloading a separate implementation. For example, you can download Privacy Bird, which integrates with Microsoft® Internet Explorer to help you manage your surfing experience. It collects information on what you do and do not wish to allow, and then monitors the sites you visit to see whether they comply with your preferences. After installing it, you set your preferences by right-clicking the little bird icon, as you can see in Figure 1.


Figure 1. Setting your preferences for Privacy Bird

As long as you are visiting sites that don't collect identifying information, Privacy Bird flaps its wings while the browser loads the page and then turns green to let you know that everything is all right, as you can see in Figure 2.


Figure 2. Privacy Bird lets you know everything is okay

If, on the other hand, you were to institute a more onerous privacy policy on your site, Privacy Bird would tell you that too, as you can see in Figure 3.


Figure 3. Privacy Bird alerts you to a more onerous privacy policy

In most cases, the bird will simply turn yellow to tell you that no privacy policy is in place, but that is changing as sites begin to provide this information in machine-readable form.


Locating a privacy policy

For a user agent to read privacy information in machine-readable form, it first needs to know where to find it. P3P specifies that each site should have a "policy reference file," which specifies not only the policy file in effect, but also the parts of the site to which it applies. For example, a simple policy reference file for Chaos Magnet might look like the one shown in Listing 1.


Listing 1. The policy reference file

<META xmlns="http://www.w3.org/2002/01/P3Pv1">

    <POLICY-REFERENCES>

        <POLICY-REF about="http://www.chaosmagnet.com/policy.p3p#BasicPolicy">
            <INCLUDE>/*</INCLUDE>
            <EXCLUDE>/vitamins/*</EXCLUDE>
            <COOKIE-INCLUDE name="*" value="*" domain="*" path="*"/>
        </POLICY-REF>

        <POLICY-REF about="http://www.chaosmagnet.com/policy.p3p#OrderFormPolicy">
            <INCLUDE>/vitamins/*</INCLUDE>
            <COOKIE-INCLUDE name="*" value="*" domain="*" path="*"/>
        </POLICY-REF>

    </POLICY-REFERENCES>

</META>

In this case, you're specifying that there's a policy located at the URL http://www.chaosmagnet.com/policy.p3p. That policy is in the form of an XML document, and includes a specific policy element identified as BasicPolicy. (You'll see the actual policy document in a moment.) That policy applies to all of the site with the exception of the vitamins directory, where there is an order form that would not comply with this policy. That directory has its own policy, as specified. Also, you've noted that all cookies comply with this policy. You also have the option to exclude specific cookies or types of cookies.
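To make the INCLUDE and EXCLUDE rules concrete, here is a minimal Python sketch of how a user agent might decide which policy covers a given path. It is an illustration only, not how any particular browser works: the function names policy_for and wildcard_match are hypothetical, the embedded reference file is modeled on Listing 1 with the cookie rules left out, and the sketch assumes that "*" simply matches any run of characters.

```python
import re
import xml.etree.ElementTree as ET

P3P_NS = "{http://www.w3.org/2002/01/P3Pv1}"

# Hypothetical reference file, modeled on Listing 1 (cookie rules omitted).
REFERENCE_FILE = """\
<META xmlns="http://www.w3.org/2002/01/P3Pv1">
  <POLICY-REFERENCES>
    <POLICY-REF about="http://www.chaosmagnet.com/policy.p3p#BasicPolicy">
      <INCLUDE>/*</INCLUDE>
      <EXCLUDE>/vitamins/*</EXCLUDE>
    </POLICY-REF>
    <POLICY-REF about="http://www.chaosmagnet.com/policy.p3p#OrderFormPolicy">
      <INCLUDE>/vitamins/*</INCLUDE>
    </POLICY-REF>
  </POLICY-REFERENCES>
</META>
"""


def wildcard_match(pattern, path):
    """Assumption: '*' matches any run of characters; the rest is literal."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path) is not None


def policy_for(path, reference_xml=REFERENCE_FILE):
    """Return the 'about' URI of the first POLICY-REF covering path, or None."""
    root = ET.fromstring(reference_xml)
    for ref in root.iter(P3P_NS + "POLICY-REF"):
        includes = [e.text.strip() for e in ref.findall(P3P_NS + "INCLUDE")]
        excludes = [e.text.strip() for e in ref.findall(P3P_NS + "EXCLUDE")]
        included = any(wildcard_match(p, path) for p in includes)
        excluded = any(wildcard_match(p, path) for p in excludes)
        if included and not excluded:
            return ref.get("about")
    return None
```

Under these assumptions, policy_for("/vitamins/order.xhtml") skips BasicPolicy because of the EXCLUDE rule and falls through to OrderFormPolicy, while any other path on the site resolves to BasicPolicy.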

This policy reference file enables the user agent to find the policy, but how does it know where to find the policy reference file? P3P specifies several ways to tell the browser or other user agent where to look. The most common (and the one that takes precedence) is to place the file at the well-known location /w3c/p3p.xml; user agents will always know to look for it there. Another option is to use a link element in an HTML or XHTML page. Finally, you have the option to send the location as an HTTP header.
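As a rough illustration of the lookup, the following Python sketch lists the places a user agent might try for a given page. It covers only the well-known location and the HTTP header (it does not scan the page for link elements), and it assumes the header takes the form P3P: policyref="...". The function name candidate_reference_urls is hypothetical.

```python
import re
from urllib.parse import urljoin

WELL_KNOWN_PATH = "/w3c/p3p.xml"


def candidate_reference_urls(page_url, headers):
    """List possible policy reference file URLs for page_url.

    headers is a plain dict of HTTP response headers. The well-known
    location comes first; a policyref found in a P3P header is appended.
    Assumption: the header looks like  P3P: policyref="some-uri"
    """
    candidates = [urljoin(page_url, WELL_KNOWN_PATH)]
    for name, value in headers.items():
        if name.lower() == "p3p":
            match = re.search(r'policyref\s*=\s*"([^"]*)"', value)
            if match:
                # Relative references are resolved against the page URL.
                candidates.append(urljoin(page_url, match.group(1)))
    return candidates
```

For a page at http://www.chaosmagnet.com/vitamins/order.xhtml served with no P3P header, the only candidate is http://www.chaosmagnet.com/w3c/p3p.xml.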

No matter how you do it, the policy reference file points to the same privacy policy, which defines the information you'll collect in the XForms form. Let's create that policy first, so that you know what you need to specify in the form.


Creating a privacy policy

Once you know what you're going to collect, creating a privacy policy isn't all that difficult. The first step is to create the policy file itself. You may have noticed that most of the "long form" privacy policies you read have basically the same structure, such as this one, created with the IBM® alphaWorks® P3P Policy Editor (see Listing 2).


Listing 2. Privacy Policy
 
About Us
This is a privacy policy for Chaos Magnet. Our homepage on the Web is located 
at http://www.chaosmagnet.com. The full text of our privacy policy is available 
on the Web at http://www.chaosmagnet.com/privacypolicy.html Users may go to 
http://www.chaosmagnet.com/optinout.html for information on how to opt-in or 
opt-out of use of their information.
We invite you to contact us if you have questions about this policy. You may 
contact us by mail at the following address:
Nicholas Chase
123 Main St.
Anywhere, IN 12345 
USA
You may contact us by e-mail at ibmquestions@nicholaschase.com. You may call 
us at 212-555-1212. 

Dispute Resolution and Privacy Seals
We have the following privacy seals and/or dispute resolution mechanisms. If 
you think we have not followed our privacy policy in some way, they can help 
you resolve your concern.

Bob's Trustworthy Site Program: Customer service complaints should be sent 
to Bob.

Additional Information
This policy is valid for 1 day from the time that it is loaded by a client.
 
Data Collection
P3P policies declare the data they collect in groups (also referred to as
 "statements"). This policy contains 1 data group.
 
Group "Access log information"
We collect the following information:
Click-stream data

HTTP protocol elements
This data will be used for the following purposes:
Completion and support of the current activity.
Web site and system administration.
Research and development.
This data will be used by ourselves and our agents.
The following explanation is provided for why this data is collected:
Our Web server collects access logs containing this information.
  
Cookies
Cookies are a technology which can be used to provide you with tailored 
information from a Web site. A cookie is an element of data that a Web site can 
send to your browser, which may then store it on your system. You can set your 
browser to notify you when you receive a cookie, giving you the chance to decide 
whether to accept it.
We do not make use of HTTP cookies. 
 
Compact Policy Summary
P3P compact policies are a form of a P3P policy which summarizes what the 
policy says about cookies. Since this policy does not mention any use of 
cookies, there is no compact policy form of this policy.
A policy mentions use of cookies if the data element "HTTP Cookies" is in 
any group in the policy. This data element is found under "Dynamic data". 
 
Policy Evaluation
Microsoft Internet Explorer 6 will evaluate this policy's compact policy 
whenever it is used with a cookie. The actions IE will take depend on what 
privacy level the user has selected in their browser (Low, Medium, Medium High, 
or High; the default is Medium). In addition, IE will examine whether the 
cookie's policy is considered satisfactory or unsatisfactory, whether the cookie 
is a session cookie or a persistent cookie, and whether the cookie is used in a 
first-party or third-party context. This section will attempt to evaluate this 
policy's compact policy against Microsoft's stated behavior for IE6.
Note: this evaluation is currently experimental and should not be considered a 
substitute for testing with a real Web browser.
Satisfactory policy: this compact policy is considered satisfactory according to 
the rules defined by Internet Explorer 6. IE6 will accept cookies accompanied by 
this policy under the High, Medium High, Medium, Low, and Accept All Cookies 
settings.

Basically, this privacy policy is designed to tell the user what you are collecting, why you are collecting it, and what you're going to do with it. Of course, a policy phrased in natural language like this leaves a lot of leeway. It would be difficult for a software application to know exactly what you are collecting if you just say "address." The P3P recommendation creates a very specific way of noting this information, and that is the notation you are going to use with the XForms form in this article.


The machine-readable privacy policy

P3P requires that privacy information be encoded in an XML file that is accessible to the "user agent" (normally a browser). Each privacy policy has the same basic structure, which lays out the areas of information, as you can see in Listing 3.


Listing 3. The basic privacy policy

<?xml version="1.0"?>
<POLICIES xmlns="http://www.w3.org/2002/01/P3Pv1">
    <EXPIRY max-age="86400"/>

    <POLICY
        name="BasicPolicy"
        discuri="http://www.chaosmagnet.com/privacypolicy.html"
        opturi="http://www.chaosmagnet.com/optinout.html"
        xml:lang="en">

        <ENTITY>
            <DATA-GROUP>
                <DATA ref="#business.contact-info.telecom.telephone.number">212-555-1212</DATA>
                <DATA ref="#business.contact-info.online.email">ibmquestions@nicholaschase.com</DATA>
                <DATA ref="#business.contact-info.online.uri">http://www.chaosmagnet.com</DATA>
                <DATA ref="#business.contact-info.postal.organization">Nicholas Chase</DATA>
                <DATA ref="#business.contact-info.postal.street">123 Main St.</DATA>
                <DATA ref="#business.contact-info.postal.city">Anywhere</DATA>
                <DATA ref="#business.contact-info.postal.stateprov">IN</DATA>
                <DATA ref="#business.contact-info.postal.postalcode">12345</DATA>
                <DATA ref="#business.contact-info.postal.country">USA</DATA>
                <DATA ref="#business.name">Chaos Magnet</DATA>
            </DATA-GROUP>
        </ENTITY>

        <ACCESS><ident-contact/></ACCESS>

        <STATEMENT>
            <DISPUTES-GROUP>
                <DISPUTES resolution-type="service"
                          service="http://www.daily-moon.com"
                          short-description="Bob's Trustworthy Site Program">
                    <LONG-DESCRIPTION>Customer service complaints should be sent to Bob.</LONG-DESCRIPTION>
                    <REMEDIES><correct/><money/><law/></REMEDIES>
                </DISPUTES>
            </DISPUTES-GROUP>

            <CONSEQUENCE>Our Web server collects access logs containing this information.</CONSEQUENCE>

            <PURPOSE><admin/><current/><develop/></PURPOSE>

            <RECIPIENT><ours/></RECIPIENT>

            <RETENTION><indefinitely/></RETENTION>

            <DATA-GROUP>
                <DATA ref="#dynamic.clickstream"/>
                <DATA ref="#dynamic.http"/>
            </DATA-GROUP>
        </STATEMENT>

    </POLICY>

    <POLICY
        name="OrderFormPolicy"
        discuri="http://www.burntstorelabs.com/privacypolicy.html"
        opturi="http://www.burntstorelabs.com/optinout.html"
        xml:lang="en">
...
    </POLICY>

</POLICIES>

Starting at the top of Listing 3, policies are valid for a set period of time. In this case, the policy is set to be valid for one day, or 86,400 seconds. Once the user agent reads the policy, it can assume that it is in effect for at least that long, so it doesn't have to read the policy again until it expires. The expiration time applies to any policies specified in the file.
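The arithmetic behind the expiry is simple enough to sketch in Python. The function name expires_at is hypothetical, and the fallback of 24 hours when no EXPIRY element is present is an assumption made for this sketch rather than something taken from the listing.

```python
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET

P3P_NS = "{http://www.w3.org/2002/01/P3Pv1}"
ASSUMED_DEFAULT_SECONDS = 86400  # assumption: treat a missing EXPIRY as one day


def expires_at(policies_xml, fetched_at):
    """Return the moment after which the policies should be fetched again."""
    root = ET.fromstring(policies_xml)
    expiry = root.find(P3P_NS + "EXPIRY")
    if expiry is not None and expiry.get("max-age") is not None:
        max_age = int(expiry.get("max-age"))
    else:
        max_age = ASSUMED_DEFAULT_SECONDS
    return fetched_at + timedelta(seconds=max_age)


# A policy file fetched at midnight UTC on 14 Nov 2006 with max-age="86400"
# stays valid until midnight UTC on 15 Nov 2006.
sample = ('<POLICIES xmlns="http://www.w3.org/2002/01/P3Pv1">'
          '<EXPIRY max-age="86400"/></POLICIES>')
fetched = datetime(2006, 11, 14, tzinfo=timezone.utc)
valid_until = expires_at(sample, fetched)
```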

The policy itself needs to be identifiable by name; this is the name specified by the fragment in the policy reference file. The policy also provides information on where to find the human-readable form of the policy (the discuri), as well as the page to which users must go if the policy enables them to opt in (meaning you won't use the information unless they specifically allow it) or opt out (meaning that you will use the information unless they specifically disallow it). And, of course, you may provide your privacy policy in a multitude of languages, so the POLICY element enables you to define the language to which this information applies.

Next, you identify yourself as the entity collecting the information. Notice the contents of the ref attributes. These are structured pieces of information that specifically define data. It is these notations you will add to the XForms form in a moment.

With the ACCESS element, you're specifying not the access you have to the user's data, but rather the access the user has to the data you've collected. This can range from no access at all, to partial access, to complete access, to no access because you're not collecting any identifiable information in the first place. Next comes the statement itself, which contains the information on what you collect and what you do with it.

It starts with information on what the user can do if he or she feels you have violated this privacy policy. Typically, this refers to groups such as TRUSTe, and specifies what you say you will do in response. The options are to correct the problem, pay money, or let the law decide the penalty (or in this case, all three).

The CONSEQUENCE doesn't so much spell out the consequences of breaking the privacy agreement as it does the consequences of not providing information. For example, in this case, the site needs the information because it tracks all activity in the server logs. If you don't wish to provide that information, you simply can't use the site. The consequences of not accepting cookies might be that the application will not work properly and you won't be able to use the shopping cart.

The PURPOSE element describes the reason you collect the information, such as for administration of the site, research and development, analysis, and so on. Options include anonymous analysis and individual analysis, and in an ideal world, users can choose the specific purposes for which they will allow their data to be used. One common option is to specify that the information is used only for the current transaction, for example in the case of a Web order.

The RECIPIENT specifies who actually gets the data. As noted here, the application passes user information only to you and your agents. Other options include shipping companies related to the transaction, third parties (not related to the transaction), and public disclosure, for example on a CD-ROM directory. The RETENTION element specifies how long you will hold the data.

Finally, the DATA-GROUP element specifies the actual information to be collected. Let's look at that structure, and how it applies to XForms.
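To show how a tool could work with these elements, here is a small Python sketch that summarizes a statement and compares it with a user's preferences. The preference sets and the function names summarize and objections are hypothetical, invented for this illustration; the statement is a trimmed copy of the one in Listing 3.

```python
import xml.etree.ElementTree as ET

P3P_NS = "{http://www.w3.org/2002/01/P3Pv1}"

# Trimmed copy of the statement from Listing 3.
SAMPLE_STATEMENT = """\
<STATEMENT xmlns="http://www.w3.org/2002/01/P3Pv1">
  <CONSEQUENCE>Our Web server collects access logs.</CONSEQUENCE>
  <PURPOSE><admin/><current/><develop/></PURPOSE>
  <RECIPIENT><ours/></RECIPIENT>
  <RETENTION><indefinitely/></RETENTION>
  <DATA-GROUP>
    <DATA ref="#dynamic.clickstream"/>
    <DATA ref="#dynamic.http"/>
  </DATA-GROUP>
</STATEMENT>
"""


def child_names(statement, tag):
    """Names of the empty elements inside PURPOSE, RECIPIENT, or RETENTION."""
    parent = statement.find(P3P_NS + tag)
    if parent is None:
        return []
    return [child.tag.replace(P3P_NS, "") for child in parent]


def summarize(statement_xml):
    """Collect what a statement declares into a plain dictionary."""
    statement = ET.fromstring(statement_xml)
    return {
        "purposes": child_names(statement, "PURPOSE"),
        "recipients": child_names(statement, "RECIPIENT"),
        "retention": child_names(statement, "RETENTION"),
        "data": [d.get("ref").lstrip("#")
                 for d in statement.iter(P3P_NS + "DATA")],
    }


def objections(summary, blocked_recipients, blocked_retention):
    """List the declared values that the (hypothetical) preferences reject."""
    problems = [f"recipient: {r}" for r in summary["recipients"]
                if r in blocked_recipients]
    problems += [f"retention: {r}" for r in summary["retention"]
                 if r in blocked_retention]
    return problems


summary = summarize(SAMPLE_STATEMENT)
# A user who objects to public disclosure and to indefinite retention:
warnings = objections(summary, {"public"}, {"indefinitely"})
```

With these sample preferences, the only warning raised is about indefinite retention, because the statement sends data to no recipient other than ours.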


The data

P3P data is grouped into several "data sets." Although it is possible to create your own data set, the most commonly used are the "core" data sets: user, thirdparty, business, and dynamic. Each of these data sets has specific values, and most of those have their own structures. For example, you might create an XForms form that requests information about the user in order to create an account. In that case, the form might include the following (see Listing 4).


Listing 4. The XForms form

...
<XForms:instance>
<!-- xmlns="" keeps the instance data in no namespace, so that the
     unprefixed XPath expressions in the bind elements can select it -->
<newuser xmlns="">
     <fullname>
          <first></first>
          <middle></middle>
          <last></last>
     </fullname>
     <email></email>
     <address>
          <street></street>
          <city></city>
          <state></state>
          <postalcode></postalcode>
     </address>
     <loginInfo>
          <username></username>
          <password></password>
     </loginInfo>
</newuser>
</XForms:instance>

<XForms:bind nodeset="/newuser/fullname/first"
 p3ptype="user.name.given" />
<XForms:bind nodeset="/newuser/fullname/middle"
 p3ptype="user.name.middle" />
<XForms:bind nodeset="/newuser/fullname/last"
 p3ptype="user.name.family" />
<XForms:bind nodeset="/newuser/loginInfo/username"
 p3ptype="user.login.id" />
<XForms:bind nodeset="/newuser/loginInfo/password"
 p3ptype="user.login.password" />
<XForms:bind nodeset="/newuser/email"
 p3ptype="user.home-info.online.email" />
<XForms:bind nodeset="/newuser/address/street"
 p3ptype="user.home-info.postal.street" />
...

By specifying the p3ptype attribute in a bind element, you are applying that information to the actual nodeset in the instance. No matter where the information is on the page, it will be recognized as that particular type.

Notice that there is a hierarchy here. You are supplying information about the user. That information consists of a name, which is defined as a "personname" structure, which includes information such as given, middle, family, prefix, and so on. Similarly, the home-info attribute is a contact structure, which itself has several attributes, including the address, which in this case is a postal structure including information on the street, city, and so on.
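Because the type names are dotted paths through this hierarchy, a tool can compare what a form collects with what a policy declares. The Python sketch below is one hypothetical way to do that: it reads the p3ptype of each bind element and reports any type not covered by the policy's data references, on the assumption that a reference such as user.name also covers everything beneath it. The sample model, the xf prefix, and the function names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

XFORMS_NS = "{http://www.w3.org/2002/xforms}"

# Hypothetical model fragment, using three of the binds from Listing 4.
SAMPLE_MODEL = """\
<xf:model xmlns:xf="http://www.w3.org/2002/xforms">
  <xf:bind nodeset="/newuser/fullname/first" p3ptype="user.name.given"/>
  <xf:bind nodeset="/newuser/email" p3ptype="user.home-info.online.email"/>
  <xf:bind nodeset="/newuser/address/street"
           p3ptype="user.home-info.postal.street"/>
</xf:model>
"""


def p3p_bindings(model_xml):
    """Map each bound nodeset to the P3P type declared for it."""
    root = ET.fromstring(model_xml)
    return {bind.get("nodeset"): bind.get("p3ptype")
            for bind in root.iter(XFORMS_NS + "bind")
            if bind.get("p3ptype")}


def undeclared(bindings, policy_refs):
    """Return the form's types that no policy reference covers.

    Assumption: a reference covers itself and every type below it, so
    'user.name' covers 'user.name.given'.
    """
    def covered(p3ptype):
        return any(p3ptype == ref or p3ptype.startswith(ref + ".")
                   for ref in policy_refs)
    return sorted(t for t in bindings.values() if not covered(t))


bindings = p3p_bindings(SAMPLE_MODEL)
# Suppose the policy declares only the name structure and the e-mail address:
missing = undeclared(bindings, ["user.name", "user.home-info.online.email"])
```

Here the street address shows up as undeclared, which would be a cue either to add it to the policy or to stop collecting it. Splitting a type on its dots, as in "user.home-info.postal.street".split("."), walks the same hierarchy described above: user, then home-info, then postal, then street.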

A complete listing of the types that make up the core data sets is beyond the scope of this article, but you can find it in tools such as the P3P Policy Editor.


Summary

XForms provides an easy way to specify the information you're collecting in terms of standard data recognized by the Platform for Privacy Preferences. By specifying your data in this way, you are providing a service both for the user and for yourself. For the user, you enable automated tools to report more accurately on what you're collecting, which helps the user make better-informed decisions about whether to provide that information. For yourself, you're providing a way to associate your arbitrary data instances with specific pieces of information that user agents (that is, browsers) may one day cache. When that day comes, users will arrive at your form and find much of their information already present, prefilled by a browser that understands what you're looking for. That will lighten the user's load, and make him or her more likely to follow through on an inquiry or purchase. And that's good for you.

