The data in the Harris County dashboard originate with the Harris County District Clerk. We start with the Clerk's monthly dispositions report.

We transform these data in the following ways:

  • Geocode the Home Address fields and jitter them to obscure the exact location for mapping.
  • Calculate the Age at Filing by subtracting the Date of Birth from the Filing Date.
  • Cross-reference the Court Number to fill in the name of the judge who presided over that court in the year of the Filing Date.
  • Standardize the Sentence Length into units of days. In the report, it is a text field that contains units of years, months, or days.
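
As a rough illustration of the transformations listed above, the sketch below shows one way they could be written in Python. It is not the dashboard's actual code. The function names, the whole-year definition of age, the 365-day year and 30-day month conversion factors, and the size of the jitter offset are all assumptions made for the example.

```python
import random
import re
from datetime import date

# Assumed conversion factors; the dashboard's actual factors are not stated.
DAYS_PER_UNIT = {"year": 365, "month": 30, "day": 1}


def age_at_filing(date_of_birth: date, filing_date: date) -> int:
    """Whole years elapsed between the Date of Birth and the Filing Date."""
    years = filing_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the filing year.
    if (filing_date.month, filing_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def sentence_length_days(text: str) -> int:
    """Sum every '<number> <unit>' pair found in a free-text Sentence Length."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s*(year|month|day)s?", text.lower()):
        total += int(amount) * DAYS_PER_UNIT[unit]
    return total


def jitter(lat: float, lon: float, max_offset_deg: float = 0.005, rng=random):
    """Shift a coordinate by a random offset of up to max_offset_deg on each axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )
```

For example, under these assumptions a Sentence Length of "2 Years 6 Months" becomes 910 days, and a person born on 1990-06-15 with a Filing Date of 2020-06-14 has an Age at Filing of 29.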

The dashboard attempts to represent these data as accurately as possible, with the following minor exceptions:

  • Bail amounts more than two standard deviations from the mean are not shown on the charts, but are still taken into account in all other areas of the dashboard. This prevents extreme outliers from skewing the visual representation of bail trends.
  • Guest judges who substitute for regular judges are not accounted for in the data. In this dashboard, the Judge Name is a direct proxy for the Court Number.
  • People marked as homeless, no known address, unknown address, and so forth, are not represented on the map. Their data are represented everywhere else in the dashboard.
  • Only the original sentence is represented in the dashboard trends. The dashboard removes duplicate case information for people who had their deferred adjudication or probation revoked.
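
Two of the exceptions above lend themselves to a short sketch: the bail outlier rule and the removal of duplicate case records. The Python below is illustrative only. The function names and the record keys "case_number" and "disposition_date" are hypothetical, the choice of population standard deviation is an assumption, and treating the earliest record per Case Number as the original sentence is one possible reading of the rule.

```python
from statistics import mean, pstdev


def bail_amounts_for_chart(amounts):
    """Keep only bail amounts within two standard deviations of the mean."""
    if len(amounts) < 2:
        return list(amounts)
    center = mean(amounts)
    spread = pstdev(amounts)  # population SD; the dashboard's choice is not stated
    return [a for a in amounts if abs(a - center) <= 2 * spread]


def original_sentences(records):
    """Keep the earliest record for each case number (hypothetical keys)."""
    first_seen = {}
    # ISO-formatted date strings sort chronologically.
    for rec in sorted(records, key=lambda r: r["disposition_date"]):
        first_seen.setdefault(rec["case_number"], rec)
    return list(first_seen.values())
```

Only the charted series would be filtered this way; totals and other summaries would still be computed from the full list of bail amounts.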

The data available for download includes some of the original data from the Harris County District Clerk, as well as some of the transformed data. We have attempted to protect personally identifiable information in this dashboard and its file downloads. All of this information is publicly available, and researchers can reconstitute a complete dataset using the Case Number to validate any findings.


This project is supported by: