Java Programmatic Browser

Submitted by Kamal Wickramanayake on July 19, 2010 - 10:54

At times you need to visit web sites, log in, navigate through pages, select portions of HTML, click on links, check whether a form exists, submit the form, and so on, and do all of this programmatically. What you need is a programmable web browser that you can start and then have a cup of tea while it does the job.

The Java SE API includes HTMLEditorKit, which you can use to parse HTML pages. I have used it once, but its capabilities are very limited: it is meant for parsing, not for implementing simple or complex navigation scenarios.
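To show what "parsing only" means in practice, here is a minimal sketch that uses the HTMLEditorKit parser callback from Java SE to collect the href of every link in a piece of HTML. The class name LinkExtractor, the method extractLinks and the sample HTML are made up for this illustration. Notice that the result is only a list of strings: there is nothing to click on and no session to carry forward, which is where this approach stops being useful for navigation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of the original article's code
public class LinkExtractor {

    // Collects the href attribute of every <a> tag found in the given HTML
    public static List<String> extractLinks(String html) throws IOException {
        final List<String> links = new ArrayList<>();

        // The parser reports each element to this callback as it is encountered
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };

        // The third argument tells the parser to ignore charset declarations in the HTML
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        // Sample HTML invented for this sketch
        String html = "<html><body>"
                + "<a href=\"/contact\">Contact</a>"
                + "<a href=\"/about\">About</a>"
                + "</body></html>";

        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

Fetching the page, following one of these links and keeping cookies between requests would all have to be written by hand on top of this.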

I have long been a fan of httpunit for things of this nature. I have used it to navigate through pages and fetch content in really complex scenarios. It is powerful enough to execute JavaScript. httpunit was designed to be used together with JUnit, so you can use it to write JUnit test cases for your project. But let's not worry about test cases here. Let's look at how to use it for some simple navigation and for parsing the response pages. Look at the code sample below and follow the comments to understand what each important line does.

package org.swview.mybrowser;

import java.io.IOException;

import org.xml.sax.SAXException;

import com.meterware.httpunit.HttpUnitOptions;
import com.meterware.httpunit.TableCell;
import com.meterware.httpunit.WebConversation;
import com.meterware.httpunit.WebLink;
import com.meterware.httpunit.WebResponse;
import com.meterware.httpunit.WebTable;

public class ProgrammaticBrowser {

    /**
     * @param args
     */
    public static void main(String[] args) {
        // Don't throw exceptions when JavaScript errors occur
        HttpUnitOptions.setExceptionsThrownOnScriptError(false);
       
        // Here's the browser
        WebConversation wc = new WebConversation();

        try {
            // Fetch a page
            WebResponse response = wc.getResponse("http://www.swview.org/");
           
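            // Get the link with the text "Contact" (null if the page has no such link)
            WebLink link = response.getLinkWith("Contact");
            if (link == null) {
                System.err.println("No link with the text \"Contact\" was found");
                return;
            }
           
            // Click on the link and get the next response
            response = link.click();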
           
            // Get all tables
            WebTable[] tables = response.getTables();
            if (tables.length == 0) {
                System.err.println("The page contains no tables");
                return;
            }
           
            // Get the first table
            WebTable firstTable = tables[0];
           
            // Get the cell at first row, second column
            TableCell emailCell = firstTable.getTableCell(0, 1);
           
            // Print the content as text
            System.out.println(emailCell.getText());
           
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        }        
    }
}

Provided that httpunit.jar and the other jar files that httpunit depends on are on the class path, you can run the compiled program like this:

java org.swview.mybrowser.ProgrammaticBrowser

You can even work with JavaScript-generated content. For example, you can click on a button generated by JavaScript, or switch to another browser window that pops up when you click on a link. I love it!

See below to download this simple application as an Eclipse project with httpunit libraries packed inside.