Filtering tricks for your Tomcat

The addition of filtering to the Servlet 2.3 spec offers enhanced performance for your J2EE apps

Level: Intermediate

Sing Li (westmakaha@yahoo.com), Author, Wrox Press

01 Jun 2001

One of the most exciting features of the new Java Servlet 2.3 specification is filtering. At first sight, Servlet 2.3 filtering is deceptively similar to existing legacy filters in Apache, IIS, Netscape Web servers and others. In reality, Servlet 2.3 filtering is a completely different design architecturally -- leveraging the object-oriented nature of the Java platform to provide a new level of performance. This article introduces you to filtering in Tomcat 4 and shows you how to make productive use of filters in your projects.

Filtering is a new feature of Tomcat 4. (For a brief history of Tomcat, see The Tomcat story). It is part of the Servlet 2.3 specification and will eventually be implemented by all J2EE container vendors supporting the standard. Developers will be able to use filters to implement features that used to be awkward or difficult to achieve, including:

  • Customized authentication for access to resources (Web pages, JSP pages, servlets)

  • Auditing and logging of resource access on an application level

  • Application-wide encrypted access to resources, based on customized encryption schemes

  • On-the-fly transformation of accessed resources, including dynamic output from servlets and JSPs

This list is certainly not exhaustive, but gives a taste of the added value that filtering brings to the table. In this article, we will take a detailed look at Servlet 2.3 filtering. We will see how filters fit into the J2EE processing model. Unlike other legacy filtering schemes, Servlet 2.3 filtering is based on nested calls. We will examine how this difference is architecturally consistent with the new high-performance Tomcat 4 design. And finally, we will gain some hands-on experience coding and testing two Servlet 2.3 filters. These filters perform very simple functions, enabling us to focus on the mechanics of how to code filters and how to integrate them into Web applications.

Filters as Web application building blocks

Physically, filters are application-level Java code components within a J2EE Web application. In addition to servlets and JSP pages, developers coding to the Servlet 2.3 specification can use filters as a mechanism for adding active behaviors to their Web applications. Unlike servlets and JSP pages, which work at specific URLs, filters tap into the processing pipeline of the J2EE container and can work across a subset (or all) of the URLs served by a Web application. Figure 1 illustrates where filtering fits into the J2EE request processing.


Figure 1. Filters and J2EE request processing

A Servlet 2.3-compliant container enables filters to access a Web request before the request is processed (by the Servlet engine) as well as after the request is processed (the filter will have access to the response).

At these points, a filter can:

  • Modify the request headers before the request is processed
  • Supply its own version of a request to be processed
  • Modify the response, after request processing, before passing it back to the user
  • Pre-empt request processing by the container altogether and generate its own response

Prior to the availability of filters, tapping into the J2EE processing pipeline required creating non-portable, container-specific, system-wide extension mechanisms (such as Tomcat 3 interceptors).





Conceptual Tomcat filtering

Unlike the familiar filtering mechanism found in Apache, IIS, or Netscape servers, Servlet 2.3 filters are not based on hooked function calls. In fact, the Tomcat 4-level engine architecture is a departure from the legacy Tomcat 3.x versions. Instead of a monolithic engine calling out to hooked methods at the different phases of request processing, the new Tomcat 4 engine uses a flow of nested calls and wrapped requests and responses internally. The different filters and resource processors form a chain.

In the legacy architecture:

  • Hooked methods are called, regardless of their implementation (even when empty), on every request.

  • Method scoping and concurrency concerns (each method may be called on a different thread) do not allow any easy or efficient sharing of variables and information between the different hooked method invocations when processing the same request.

In the new architecture:

  • Nested method calls are made through a chain of filters, consisting only of filters that apply to the current request; legacy implementations based on hooked calls require a call to the hook routine at every processing phase, even when the processing logic for a specific phase does nothing at all.

  • Local variables are preserved and available until the actual filtering method returns (because the calls on upstream filters are always on the stack, waiting for downstream calls to return).

This new architecture provides a new, more object-friendly basis for future Tomcat performance tuning and optimization. Servlet 2.3 filters are a natural extension of this new internal architecture. The architecture provides Web application designers with a portable way of implementing filtering behavior.
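The nested-call idea can be illustrated without any servlet classes. The following is a minimal sketch, not Tomcat's implementation: the `Step` interface and the `Chain` and `NestedCallDemo` classes are hypothetical stand-ins invented for this illustration. It shows that each step's local variables are still on the stack when the downstream call returns.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a filter; this is not the javax.servlet API. */
interface Step {
    void handle(StringBuilder log, Chain chain);
}

/** Walks a fixed list of steps using nested calls, then runs the target. */
class Chain {
    private final List<Step> steps;
    private final Runnable target;
    private int next = 0;

    Chain(List<Step> steps, Runnable target) {
        this.steps = steps;
        this.target = target;
    }

    void proceed(StringBuilder log) {
        if (next < steps.size()) {
            steps.get(next++).handle(log, this);   // nested call into the next step
        } else {
            target.run();                          // end of the chain: the resource
        }
    }
}

class NestedCallDemo {
    static String run() {
        StringBuilder log = new StringBuilder();
        Step outer = (l, chain) -> {
            String label = "outer";                // local variable, set before the nested call
            l.append(label).append("-before ");
            chain.proceed(l);
            l.append(label).append("-after");      // still available once downstream returns
        };
        Step inner = (l, chain) -> {
            l.append("inner-before ");
            chain.proceed(l);
            l.append("inner-after ");
        };
        List<Step> steps = new ArrayList<>();
        steps.add(outer);
        steps.add(inner);
        new Chain(steps, () -> log.append("resource ")).proceed(log);
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A step that simply returns without calling `chain.proceed()` would stop everything downstream of it, which is the same mechanism a filter uses to block a request.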





Chain of calls

All filters obey the filter chain of calls, enforced through a well-defined interface. A Java class implementing a filter must implement the javax.servlet.Filter interface. This interface contains three methods that a filter must implement:

  • doFilter(ServletRequest, ServletResponse, FilterChain): This is the method where the filter action is performed. It is also the method through which the upstream filter (or the container) calls into this filter. The incoming FilterChain object is used to call the next filter downstream.

  • init(FilterConfig): This is an initialization method that the container calls. It is guaranteed to be called by the container before the first doFilter() invocation. You can obtain the initialization parameters specified in the web.xml file.

  • destroy(): The container calls this method before destroying the filter instance, after all activities in doFilter() have ceased for the instance.

Note: The method names and semantics for the Filter interface have been changing through the recent beta cycles. The Servlet 2.3 specification still has not reached final draft stage. In Beta 1, the interface included setFilterConfig() and getFilterConfig() methods instead of init() and destroy().

The nested call chaining occurs within the doFilter() method implementation. Unless you set up a filter to explicitly block all downstream processing (by other filters and the resource processor), the filter must make the following call on the FilterChain object passed into its doFilter() method (named chain here):

chain.doFilter(request, response);





Installing filters: Definitions and mappings

The container learns about the filters within a Web application through the deployment descriptor, the web.xml file. There are two new tags associated with filters: <filter> and <filter-mapping>. You should specify them as children of the <web-app> tag inside the web.xml file.

Elements of the filter definition

The <filter> tag is a filter definition, and it must have a <filter-name> and a <filter-class> subelement. The <filter-name> subelement gives a text-based name associated with this instance of the filter. The <filter-class> subelement specifies the actual class that the container loads. Optionally, you can include an <init-param> subelement to supply the filter instance with initialization parameters. For example, the following filter definition defines a filter called IE Filter:


Listing 1. Filter definition tag

When the container processes the web.xml file, it typically creates one instance of a filter for each filter definition found. This instance is used to service all the applicable URL requests; therefore, it is of utmost importance to code filters in a thread-safe fashion.
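Because one instance serves many concurrent requests, any state kept in instance fields is shared between threads. The sketch below illustrates one common approach under that assumption: per-request data stays in local variables, and shared counters use an atomic type. The `RequestCounter` and `ThreadSafetyDemo` classes are hypothetical and only mimic what a filter's doFilter() method might do.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper that a single shared filter instance might own. */
class RequestCounter {
    // Shared across all threads, so it is updated atomically.
    private final AtomicLong hits = new AtomicLong();

    /** Called once per request, possibly from many threads at once. */
    long record(String path) {
        String normalized = path.toLowerCase();   // per-request data stays local
        if (normalized.isEmpty()) {
            return hits.get();
        }
        return hits.incrementAndGet();
    }

    long total() {
        return hits.get();
    }
}

class ThreadSafetyDemo {
    /** Simulates several container threads calling into the same instance. */
    static long run(int threads, int requestsPerThread) {
        RequestCounter counter = new RequestCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < requestsPerThread; j++) {
                    counter.record("/index.html");
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(run(8, 1000));
    }
}
```

A plain `long` field incremented with `hits++` in the same place could lose updates when two requests arrive at the same moment.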

Filter mapping and subelements

The <filter-mapping> tag represents a filter mapping, specifying the subset of URLs that the filter will act on. It must have a <filter-name> subelement that names the filter definition you want to map. Next, you specify the mapping using either a <servlet-name> or a <url-pattern> subelement. The <servlet-name> subelement specifies a servlet (defined elsewhere in the web.xml file) to which this filter applies. You can use <url-pattern> to specify a subset of URLs that the filter applies to. For example, the pattern /* indicates that the filter mapping applies to every URL served in this application, while the pattern /dept/humanresources/* indicates that the filter mapping applies only to URLs specific to the human resources department.
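To make the pattern examples concrete, here is a simplified sketch of how such patterns could be compared with a request path (relative to the application root). It is an illustration only, not the container's actual matching code, and the `UrlPatternDemo` class and its `matches()` method are hypothetical names. It handles exact paths, path prefixes ending in /*, and extension patterns such as *.jsp.

```java
class UrlPatternDemo {
    /** Simplified matcher: exact paths, "/prefix/*" (including "/*"), and "*.ext". */
    static boolean matches(String pattern, String path) {
        if (pattern.endsWith("/*")) {
            // Path-prefix pattern: strip the trailing "/*" and compare prefixes.
            String prefix = pattern.substring(0, pattern.length() - 2);
            return path.equals(prefix) || path.startsWith(prefix + "/");
        }
        if (pattern.startsWith("*.")) {
            // Extension pattern: compare the ".ext" suffix.
            return path.endsWith(pattern.substring(1));
        }
        return pattern.equals(path);   // anything else must match exactly
    }

    public static void main(String[] args) {
        System.out.println(matches("/*", "/index.html"));
        System.out.println(matches("/dept/humanresources/*", "/dept/finance/budget.jsp"));
    }
}
```

With these rules, /* matches every path in the application, while /dept/humanresources/* matches only paths below that directory.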

The container uses these filter mappings to determine if a particular filter should participate for a specific request. Here is a filter mapping applying the IE Filter defined in Listing 1 to all URLs for an application:


Listing 2. Filter mapping tag




Creating a simple filter

It is time to code our first filter. This filter is a trivial one, examining the request headers to determine if an Internet Explorer browser is being used to view the URL. If it is an Internet Explorer browser, the filter displays an "access denied" message. Although trivial in action, this example illustrates:

  • The general anatomy of a filter
  • A filter that examines header information before it reaches the resource processor
  • How you would code a filter that stops downstream processing based on some run-time detected condition (authentication parameters, originating IP, time of day, etc.)

The source code for this filter is located within the source code distribution (see Download) as IEFilter.java, part of the com.ibm.devworks.filters package. Let's work through the code for this filter now.


Listing 3. Implement the Filter interface

All filters must implement the Filter interface. We declare an instance variable to hold the FilterConfig object that the container passes in when it initializes the filter. This occurs sometime before the first call to doFilter().


Listing 4. The doFilter method

doFilter() is where most of the work is done. We examine the "User-Agent" request header, which browsers normally supply. We convert it to lowercase, and then check for the telltale "msie" identification string. If an Internet Explorer browser is detected, we obtain a PrintWriter from the response object to write out our own response. After writing out the custom response, the method returns without chaining to any other filters. This is how a filter can block downstream processing.
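The header test itself is easy to isolate. The sketch below shows one way to write it, assuming the header value has already been read from the request; the `UserAgentCheck` class and `isInternetExplorer()` method are hypothetical names and are not taken from IEFilter.java. It also guards against a missing header, which would otherwise cause a NullPointerException.

```java
import java.util.Locale;

class UserAgentCheck {
    /** Returns true when the User-Agent value contains the "msie" marker. */
    static boolean isInternetExplorer(String userAgent) {
        if (userAgent == null) {
            return false;                 // header absent: treat as "not IE"
        }
        // Lowercase with a fixed locale so the comparison is predictable.
        return userAgent.toLowerCase(Locale.ROOT).contains("msie");
    }

    public static void main(String[] args) {
        // Sample header values for illustration only.
        System.out.println(isInternetExplorer("Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"));
        System.out.println(isInternetExplorer("Mozilla/4.76 [en] (Windows NT 5.0; U)"));
    }
}
```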

If the browser is not Internet Explorer, we continue and perform the normal chaining, giving downstream filters and processors a chance at the request:


Listing 5. Perform normal chaining


      chain.doFilter(request, response);
    }

Then, we implement the init() and destroy() methods in this filter trivially:


Listing 6. The init() and destroy() methods

    public void destroy() {
    }

    public void init(FilterConfig filterConfig) {
        this.filterConfig = filterConfig;
    }
}





Testing IEFilter

Assuming that you have Tomcat 4 beta 3 (or later) installed and operational, follow these steps to get IEFilter up and running:

  1. Add a new application context to Tomcat's server.xml file (in the $TOMCAT_HOME/conf directory), right after the examples context:

        <!-- Tomcat Examples Context -->
        <Context path="/examples" docBase="examples" debug="0"
                 reloadable="true">
           ...
        </Context>
        <Context path="/devworks" docBase="devworks" debug="0"
                 reloadable="true">
          <Logger className="org.apache.catalina.logger.FileLogger"
                  prefix="localhost_devworks_log." suffix=".txt"
                  timestamp="true"/>
        </Context>


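  2. Make sure that the application's deployment descriptor (the WEB-INF/web.xml file under the devworks directory) contains the following filter definition and mapping:

    <web-app>
        <filter>
            <filter-name>IE Filter</filter-name>
            <filter-class>com.ibm.devworks.filters.IEFilter</filter-class>
        </filter>
        <filter-mapping>
            <filter-name>IE Filter</filter-name>
            <url-pattern>/*</url-pattern>
        </filter-mapping>
    </web-app>
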

  3. Create a new directory called devworks under the $TOMCAT_HOME/webapps directory, and copy everything under the devworks directory (including all subdirectories) from the source code distribution to this location. Now you're ready to start Tomcat 4.

  4. Use the following URL to access a simple index.html page:
    http://<hostname>:8080/devworks/index.html

    If you use Internet Explorer, you should see the custom "access denied" message as shown in Figure 2.


    Figure 2. IEFilter at work with Internet Explorer

    If you use Netscape, you'll see the actual HTML page, as shown in Figure 3.


    Figure 3. IEFilter pass-through with Netscape browser





Writing filters that transform resources

It's now time to attempt a more complex filter. This filter:

  • Reads a set of "search" and "replace" text from the instance initialization parameters in the filter definition

  • Filters the content of the resource being accessed, replacing the first occurrence of the "search" text with the "replace" text

As we work through this filter, you will become familiar with the structure of a content transform/replacement filter. The identical structure can be used in any encryption, compression, and transformation (such as XML via XSLT) filters.

The core secret is to pass a customized wrapper version of the response object downstream during chaining. This custom wrapper response object must hide the original response object (thus wrapping it) and provide a custom stream for the downstream processors to write into. If the work (text replacement, transformation, compression, encryption, etc.) can be performed on the fly (simple character-replacement encryption, for example), the custom stream implementation should intercept the downstream writes, perform the work, and write the transformed data into the wrapped response object. If the work cannot be performed on the fly, the custom stream must wait until the downstream processor finishes writing to the stream (that is, when it closes or flushes the stream). It then performs the transformation and writes the transformed output to the "real" response.
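The on-the-fly case can be sketched with plain java.io classes. In the example below, the hypothetical `Rot13Stream` class plays the role of the custom stream and a ByteArrayOutputStream stands in for the "real" response stream. ROT13 is used only as a toy character-replacement scheme. A real filter would extend ServletOutputStream instead, as the listings that follow do.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Transforms each byte as it is written, then passes it to the wrapped stream. */
class Rot13Stream extends FilterOutputStream {
    Rot13Stream(OutputStream real) {
        super(real);
    }

    @Override
    public void write(int b) throws IOException {
        int c = b & 0xFF;
        if (c >= 'a' && c <= 'z') {
            c = 'a' + (c - 'a' + 13) % 26;
        } else if (c >= 'A' && c <= 'Z') {
            c = 'A' + (c - 'A' + 13) % 26;
        }
        out.write(c);   // no buffering needed: each byte is independent
    }
}

class OnTheFlyDemo {
    static String run(String text) {
        ByteArrayOutputStream real = new ByteArrayOutputStream();
        try (Rot13Stream stream = new Rot13Stream(real)) {
            // FilterOutputStream routes array writes through write(int).
            stream.write(text.getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(real.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(run("Hello"));
    }
}
```

Text search and replace does not fit this pattern, because a match may span several writes. That is why the filter developed next buffers the output first.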

In our filter (ReplaceTextFilter), the customized wrapper response object is called ReplaceTextWrapper. The custom stream implementation is called ReplaceTextStream. You will find the source code in the ReplaceTextFilter.java file in the com.ibm.devworks.filters package (see Download). Let's examine the source code now.


Listing 7. ReplaceTextStream class

class ReplaceTextStream extends ServletOutputStream {
     private OutputStream intStream;
     private ByteArrayOutputStream baStream;
     private boolean closed = false;
     private String origText;
     private String newText;
     public ReplaceTextStream(OutputStream outStream, 
                              String searchText, 
                              String replaceText) {
          intStream = outStream;
          baStream = new ByteArrayOutputStream();
          origText = searchText;
          newText = replaceText;
     }

This is the code for our customized output stream. The intStream variable contains a reference to the actual stream from the response object. baStream is our buffered version of the output stream, the one into which downstream processors will be writing. The closed flag indicates whether close() has been called already on this stream instance. The constructor stores away the stream reference from the response object and creates the buffer stream. It also stores the text string for later replacement operations.


Listing 8. The write() method


    public void write(int i) throws java.io.IOException {
         baStream.write(i);
     }

Deriving from ServletOutputStream, we must provide our own write() method. Here, of course, we write to the buffered stream. All higher level output methods from downstream processors will be using this method at the lowest level, guaranteeing that all writes will be to our buffered stream.


Listing 9. The close() and flush() methods

    public void close() throws java.io.IOException {
        if (!closed) {
            processStream();
            intStream.close();
            closed = true;
        }
    }

    public void flush() throws java.io.IOException {
        if (baStream.size() != 0) {
            if (!closed) {
                processStream();           // need to synchronize the flush!
                baStream = new ByteArrayOutputStream();
            }
        }
    }

The close() and flush() methods are where we perform the transformation. Depending on downstream processors, either method or both may be called. We use the boolean closed flag to avoid anomalies. Note that we delegate the actual replacement work to the processStream() method.


Listing 10. The processStream() method

    public void processStream() throws java.io.IOException {
         intStream.write(replaceContent(baStream.toByteArray()));
         intStream.flush();
     }
 

The processStream() method writes the transformed output from the baStream to the intStream that it had been keeping around. The transformation work is isolated to the replaceContent() method.


Listing 11. The replaceContent() method


    public byte[] replaceContent(byte[] inBytes) {
        String retVal;
        // Decode once, using the platform default character encoding.
        String tpString = new String(inBytes);
        // Search a lowercased copy so that matching ignores case.
        String srchString = tpString.toLowerCase();
        int endBody = srchString.indexOf(origText.toLowerCase());
        if (endBody != -1) {
            // Splice the replacement in place of the first match only.
            retVal = tpString.substring(0, endBody) + newText +
                     tpString.substring(endBody + origText.length());
        } else {
            retVal = tpString;
        }
        return retVal.getBytes();
    }
}

replaceContent() is where the search and replace occurs. It takes a byte array as input and returns a byte array, creating a very clean conceptual interface. In fact, we can perform any type of transformation by replacing the logic within this method. Here, we perform the very simple text substitution.
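As an illustration of swapping in a different transformation behind the same byte-array interface, the sketch below compresses the buffered content with GZIP. The `GzipTransform` class and its method names are hypothetical and are not part of the article's source code. A real compression filter would also have to set the Content-Encoding response header and check that the client accepts gzip, which is omitted here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipTransform {
    /** Same shape as replaceContent(): bytes in, transformed bytes out. */
    static byte[] transform(byte[] inBytes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(inBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();   // complete only after the GZIP stream is closed
    }

    /** Reverses transform(); included only to check the round trip. */
    static byte[] restore(byte[] zipped) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(zipped))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] page = "<html>This page cannot be displayed</html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(page, restore(transform(page))));
    }
}
```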


Listing 12. The ReplaceTextWrapper class

class ReplaceTextWrapper extends HttpServletResponseWrapper {
    private PrintWriter tpWriter; 
    private ReplaceTextStream tpStream;
    public ReplaceTextWrapper(ServletResponse inResp, String searchText,
                              String replaceText) 
                              throws java.io.IOException { 
         super((HttpServletResponse) inResp);
         tpStream = new ReplaceTextStream(inResp.getOutputStream(), 
                                          searchText, 
                                          replaceText);
         tpWriter = new PrintWriter(tpStream);
    }
    public ServletOutputStream getOutputStream() throws java.io.IOException {
            return tpStream;
     }
    public PrintWriter getWriter() throws java.io.IOException {
            return tpWriter;
     }
}

Our custom wrapper response conveniently derives from the helper class HttpServletResponseWrapper. This class trivially implements many of the methods, allowing us to simply override the getOutputStream() method and the getWriter() method, supplying an instance of our custom output stream.


Listing 13. The ReplaceTextFilter class

public final class ReplaceTextFilter implements Filter {
    private FilterConfig filterConfig = null;
    private String searchText = ".";
    private String replaceText = ".";

    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
        throws IOException, ServletException {
        ReplaceTextWrapper myWrappedResp =
            new ReplaceTextWrapper(response, searchText, replaceText);
        chain.doFilter(request, myWrappedResp);
        // Push any text still buffered in the PrintWriter into the custom stream
        // before closing it, so that output written through getWriter() is not lost.
        myWrappedResp.getWriter().flush();
        myWrappedResp.getOutputStream().close();
    }

    public void destroy() {
    }

Finally, there's the filter itself. It does little more than create a custom wrapper response instance and hand it downstream using the FilterChain.


Listing 14. The init() method
    public void init(FilterConfig filterConfig) {
        String tpString;
        if ((tpString = filterConfig.getInitParameter("search")) != null)
            searchText = tpString;
        if ((tpString = filterConfig.getInitParameter("replace")) != null)
            replaceText = tpString;
        this.filterConfig = filterConfig;
    }
}

In the init method, we retrieve the initial parameters specified in the filter definition. The getInitParameter() method from the filterConfig object is handy for this purpose.





Testing ReplaceTextFilter

Assuming you tested the IEFilter using the steps described earlier and have copied over all the files to $TOMCAT_HOME/webapps/devworks, you can test the ReplaceTextFilter by following these steps:

  1. Edit the application's web.xml file so that it contains the following filter definition and mapping:

    <web-app>
        <filter>
            <filter-name>Replace Text Filter</filter-name>
            <filter-class>com.ibm.devworks.filters.ReplaceTextFilter</filter-class>
            <init-param>
                <param-name>search</param-name>
                <param-value>cannot</param-value>
            </init-param>
            <init-param>
                <param-name>replace</param-name>
                <param-value>must not</param-value>
            </init-param>
        </filter>
        <filter-mapping>
            <filter-name>Replace Text Filter</filter-name>
            <url-pattern>/*</url-pattern>
        </filter-mapping>
    </web-app>


  2. Restart Tomcat.

  3. Now, use the following URL to access the index.html page:
    http://<hostname>:8080/devworks/index.html

Note how the ReplaceTextFilter has changed the word cannot into must not on the fly. To convince yourself that filtering works with all resources, you may want to try writing JSP pages or servlets that have outputs containing the string cannot.





The importance of filter chain ordering

The order of filter chaining is determined by the order of the <filter-mapping> statements within the web.xml descriptor. In most cases, the order of filter chaining is important. That is, applying filter A before filter B will give entirely different results than applying filter B before filter A. If you're using more than one filter in an application, be careful when entering the <filter-mapping> statement.

We can easily see this effect by arranging the <filter-mapping> within our web.xml file as:


Listing 15. Order of filtering -- IE Filter first
<web-app>
    <filter-mapping>
        <filter-name>IE Filter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
    <filter-mapping>
        <filter-name>Replace Text Filter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
</web-app>

Now, load the index.html page with Internet Explorer. You will see that because the IE Filter is the first in the chain, the Replace Text Filter does not get a chance to execute. Therefore, the output is the message "Sorry, page cannot be displayed!"

Now, reverse the order of the <filter-mapping> tags to:


Listing 16. Order of filtering -- Replace Text Filter first

<web-app>
    <filter-mapping>
        <filter-name>Replace Text Filter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
    <filter-mapping>
        <filter-name>IE Filter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
</web-app>

Load the index.html page with Internet Explorer again. This time, the Replace Text Filter executes first inbound, supplying the IE Filter with its wrapped response object. After the IE Filter has written its custom response, the specialized response object transforms the output before it reaches the end user. Therefore, we see this message instead: Sorry, page must not be displayed!
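The effect of ordering can be reproduced in a few lines of plain Java. This is a simplified model rather than the servlet API: each stand-in filter is a function that wraps the rest of the chain, and `BLOCKER`, `REPLACER`, and `OrderingDemo` are hypothetical names chosen for this sketch. The message text is the one quoted above.

```java
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

class OrderingDemo {
    // Stand-in for IE Filter when the browser is IE: it never calls downstream.
    static final UnaryOperator<Supplier<String>> BLOCKER =
        downstream -> () -> "Sorry, page cannot be displayed!";

    // Stand-in for Replace Text Filter: rewrites whatever downstream produced.
    static final UnaryOperator<Supplier<String>> REPLACER =
        downstream -> () -> downstream.get().replaceFirst("cannot", "must not");

    /** Builds a two-filter chain in the given order and returns the response body. */
    static String serve(UnaryOperator<Supplier<String>> first,
                        UnaryOperator<Supplier<String>> second) {
        Supplier<String> resource = () -> "<html>index</html>";
        return first.apply(second.apply(resource)).get();
    }

    public static void main(String[] args) {
        System.out.println(serve(BLOCKER, REPLACER));   // blocker first
        System.out.println(serve(REPLACER, BLOCKER));   // replacer first
    }
}
```

When the blocker comes first, the replacer is never invoked. When the replacer comes first, it sees and rewrites the blocker's message, which mirrors the two results described above.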





Deploying filters in your applications

At the time of this writing, Tomcat 4 was in a very late beta cycle with the official release pending shortly. Leading J2EE container vendors are all ready to incorporate the Servlet 2.3 specification into their products. Having a fundamental understanding of how Servlet 2.3 filters work enables you to add yet another versatile tool to your arsenal when designing and coding J2EE-based applications.






Download

Description    Name                  Size    Download method
Sample code    j-tomcatsource.zip    8 KB    HTTP


About the author

Sing Li is the author of Professional Jini and numerous other books with Wrox Press. He is a regular contributor to technical magazines and is an active evangelist of the P2P revolution. Sing is a consultant and freelance writer and can be reached at westmakaha@yahoo.com.



