Contrast Crawler (preview)

Note

This feature is in preview mode and not generally available to all users. For access to this feature, contact Contrast support.

The Contrast Crawler is an application that helps you exercise a web application instrumented with a Contrast agent that has Assess turned on. The Crawler exercises the web application’s request handlers and code paths. Exercising the application in this way lets the agent detect any vulnerabilities in those areas.

If you don't have broad test coverage for your web applications, use the Crawler against these applications to get a more complete security analysis from Assess.

What does the Crawler do?

The Crawler retrieves a list of pages from an application, searches each page for links to other application pages, and adds those pages to a queue to be retrieved and searched. The Crawler also searches the retrieved pages for HTML forms and presses each form's Submit button twice: once with default input values and once with generated input data.
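The retrieve-search-enqueue cycle described above is a standard breadth-first crawl. Here is a minimal sketch in Python, assuming a hypothetical `fetch_links(url)` helper that returns the link URLs found on a page (the real Crawler renders pages in a browser and also handles form submission, which this sketch omits):

```python
from collections import deque

def crawl(seed_urls, fetch_links):
    """Breadth-first crawl: retrieve pages, collect their links,
    and queue any link that has not already been seen."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:        # skip pages already queued or retrieved
                seen.add(link)
                queue.append(link)
    return visited

# Demo with a mocked link graph standing in for a live application.
site = {"/": ["/a", "/b"], "/a": ["/b"], "/b": []}
order = crawl(["/"], lambda u: site.get(u, []))
```

The `seen` set is what prevents the queue from growing without bound when pages link back to one another.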

Application considerations

When selecting an application to use with the Contrast Crawler, consider these conditions:

  • Currently, the Crawler does not exercise API endpoints.

  • In addition to searching application pages for links, the Crawler searches returned pages for HTML forms and can fill form inputs and POST them if the form contains a button element with type=submit.

    Modern web applications, particularly Single Page Applications (SPAs), may trigger data submissions using JavaScript event handlers not attached to a Submit button. Posting data from these applications is not currently supported.

The Crawler workflow

  • The Crawler retrieves a list of route URLs from Contrast and uses that list to initialize a request queue. It renders the retrieved pages into a Document Object Model (DOM).

  • The Crawler searches a retrieved page’s DOM for links (HTML anchor elements) and forms (HTML form elements) that point to other pages in the application. It adds the URLs to the queue of pages to retrieve, if they are not already queued.

  • For each form that the Crawler finds, it performs two submissions: one with the form's default input values and a second with the inputs filled with generated data. It submits the form by simulating a press of a button element in the form with type submit.

    You might be able to improve results for a particular application by adjusting formats for dates and telephone numbers, or adding hints to specify what kind of data should be supplied for particular input names. You can make these adjustments by using a configuration file called faker.json.

  • The Contrast Crawler then repeats the retrieval and submission process. If any form submissions from the first crawl create new resources, the Crawler can retrieve and exercise the URLs to the new resources in the second crawl.
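As an illustration only (the schema of faker.json is not documented here, so every key below is hypothetical), a configuration like this might adjust generated-data formats and hint at the kind of data expected for particular input names:

```json
{
  "formats": {
    "date": "YYYY-MM-DD",
    "phone": "###-###-####"
  },
  "hints": {
    "customer_email": "email",
    "zip": "postal_code"
  }
}
```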

Authentication

You can configure the Contrast Crawler to support many web applications requiring authentication. To use this feature, record an authentication script (Playwright provides this functionality) as a user connects to the application with a Playwright-instrumented browser. During the recording, the user must log in to the application, log out of the application, and select two page elements (indicators): one visible only when logged in and another visible only when logged out.

When you configure the Crawler to support authentication, it does two separate crawls internally:

  1. First, all page retrievals are attempted in an authenticated browser session.

  2. Next, page retrievals are attempted in an unauthenticated browser session.

The Crawler uses session indicator elements to verify the browser authentication state. If the current browser session is not in the desired state, the Contrast page handler can execute the login or logout script as required and resubmit the request.
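The indicator-based state check can be sketched as follows. This is an illustrative model, not the Crawler's actual implementation: `session` is assumed to expose an `is_visible(selector)` check plus the recorded `login()` and `logout()` scripts, and the selector names are made up.

```python
def ensure_auth_state(session, want_authenticated,
                      logged_in_indicator, logged_out_indicator):
    """Verify the session's authentication state via the two recorded
    indicator elements; run the login or logout script if the session
    is not in the desired state."""
    if want_authenticated:
        if not session.is_visible(logged_in_indicator):
            session.login()             # run the recorded login script
    else:
        if not session.is_visible(logged_out_indicator):
            session.logout()            # run the recorded logout script

class FakeSession:
    """Hypothetical stand-in for a Playwright-driven browser session."""
    def __init__(self, authenticated):
        self.authenticated = authenticated
    def is_visible(self, selector):
        # "#profile" is visible only when logged in;
        # the logged-out indicator is visible only when logged out.
        if selector == "#profile":
            return self.authenticated
        return not self.authenticated
    def login(self):
        self.authenticated = True
    def logout(self):
        self.authenticated = False
```

After correcting the state, the Crawler can resubmit the original request in a session that is known to be in the desired state.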

Next steps