Pittsburgh PLI/DOMI/ES Violations Report

This dataset originally housed Department of Permits, Licenses, and Inspections (PLI) violations (2015-2020), and has since been expanded to include violations logged by other units (including DOMI) from 2020-06-01 to the present. These data are used to manage and track all updates to casefiles by city employees and can be used to understand when citations are issued and when investigations and court proceedings take place, the nature and location of each violation, and the status of the casefile at any point in time. Because the data contain addresses and parcel numbers, users can also display the information on geospatial maps.

Collection/Interpretation

It is important to understand the distinction between violations and casefiles, and how updates to a casefile are represented in the dataset. A casefile comprises one or more violations. When an initial investigation is conducted, each of these violations is recorded separately. The investigation will result in a new status for all of the violations ("VIOLATIONS FOUND"). The subject of the investigation will be informed of this outcome and must address the problem(s). A follow-up inspection is then conducted and, depending on the results, further steps will be taken (follow-up investigations, criminal complaints, court proceedings, etc.).

Each investigation of each violation in a casefile is represented as a unique row in the dataset. As explained above, there will be a minimum of two updates for each violation (the initial and follow-up investigations). Though all violations in a casefile are investigated simultaneously, each violation's investigation is recorded as its own row. Thus, for a property with three violations there will be a minimum of six rows (both investigations for each violation). It is possible to track the entire case history by observing all rows for each casefile.
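The row structure described above can be illustrated with a minimal Python sketch. It assumes the rows have already been loaded as dictionaries and that investigation_date is an ISO-formatted string (so text sorting matches date order). The casefile number, the section codes, and the dates below are made-up placeholders, not real records.

```python
from collections import defaultdict

def case_history(rows):
    """Group rows by casefile_number and order each group chronologically."""
    history = defaultdict(list)
    for row in rows:
        history[row["casefile_number"]].append(row)
    for records in history.values():
        # Assumes ISO dates (YYYY-MM-DD), which sort correctly as text.
        records.sort(key=lambda r: (r["investigation_date"],
                                    r["violation_code_section"]))
    return dict(history)

# Hypothetical casefile: three violations, each with an initial and a
# follow-up investigation, giving the minimum of six rows.
sections = ["SECTION-A", "SECTION-B", "SECTION-C"]  # placeholder codes
dates = ["2021-04-15", "2021-03-01"]  # deliberately out of order
rows = [
    {"casefile_number": "CF-PLI-2021-000001",
     "violation_code_section": section,
     "investigation_date": date}
    for date in dates
    for section in sections
]

history = case_history(rows)
for record in history["CF-PLI-2021-000001"]:
    print(record["investigation_date"], record["violation_code_section"])
```

Reading the rows for one casefile in date order gives the full case history, with one line per violation per investigation.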

Each violation is cited according to the violation_code_section field.

The casefile_number is the only unique identifier for each casefile (the entire group of violations). By using the casefile_number and violation_code_section fields in combination, one can track the history of each violation for a given casefile. Combining these two fields with investigation_date yields a unique identifier for each record.
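As a rough sketch of how these keys might be built in Python, the helpers below form the two-field violation key and the three-field record key, and report any record keys that occur more than once. The function names and the sample rows are hypothetical; only the three field names come from the description above.

```python
from collections import Counter

def violation_key(row):
    """Key identifying one violation within one casefile."""
    return (row["casefile_number"], row["violation_code_section"])

def record_key(row):
    """Key identifying one record: a violation plus its investigation date."""
    return violation_key(row) + (row["investigation_date"],)

def duplicate_record_keys(rows):
    """Return the record keys that appear more than once, if any."""
    counts = Counter(record_key(row) for row in rows)
    return [key for key, n in counts.items() if n > 1]

# Made-up sample rows for illustration only.
sample = [
    {"casefile_number": "CF-PLI-2021-000001",
     "violation_code_section": "SECTION-A",
     "investigation_date": "2021-03-01"},
    {"casefile_number": "CF-PLI-2021-000001",
     "violation_code_section": "SECTION-A",
     "investigation_date": "2021-04-15"},
]

print(duplicate_record_keys(sample))  # an empty list means every key is unique
```

Running a check like this after download is one way to confirm that the three-field combination is unique in the extract at hand.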

DOMI (the Department of Mobility and Infrastructure), PLI, and Environmental Services (ES) all use this system to log violations. In most cases, the department involved in the casefile can be extracted from the casefile_number field (beginning with the 4th character). For instance, a casefile_number like CF-PLI-2021-025422 represents a violation reported by PLI. The remaining casefile IDs start with "O-"; these are PLI violation codes from an old ticketing system.
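One possible way to extract the department in Python is sketched below. It assumes the CF-<department>-<year>-<number> layout of the example above holds for the other departments as well, and it returns whatever token follows "CF-" without checking it against a list of department codes, since the exact codes used for DOMI and ES are not stated here. The function name is hypothetical.

```python
def department_from_casefile(casefile_number):
    """Best-effort department lookup from a casefile_number string."""
    if casefile_number.startswith("O-"):
        # IDs from the old ticketing system are PLI violations.
        return "PLI"
    parts = casefile_number.split("-")
    if len(parts) >= 2 and parts[0] == "CF" and parts[1]:
        # The department token begins at the 4th character, right after "CF-".
        return parts[1]
    # Unrecognized format: leave the department undetermined.
    return None

print(department_from_casefile("CF-PLI-2021-025422"))
```

Returning None for unrecognized formats, rather than guessing, reflects the caveat that the extraction works only in most cases.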

The records from 2020-06 onward are obtained from the City's Computronix system, one of several independent systems used by the City to track property-level data.

Preprocessing/Formatting

All string values (most fields) were converted to UPPERCASE. The data are manually entered and often contain non-uniform formatting. While several approaches to cleaning the data exist, including leaving the cleaning to users after they access the data here, text field values were transformed to UPPERCASE so that the data are uniformly formatted. Future improvements to this ETL pipeline may approach this problem with a more sophisticated technique.
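The transformation can be approximated with the short Python sketch below. It is not the pipeline's actual code: the function name, the sample row, and the "status" and "inspection_count" keys are hypothetical, and the sketch only uppercases string values, leaving non-string values and field names untouched.

```python
def uppercase_strings(row):
    """Return a copy of the row with every string value uppercased."""
    return {
        field: value.upper() if isinstance(value, str) else value
        for field, value in row.items()
    }

# Made-up row with inconsistent capitalization.
raw = {"status": "Violations Found", "inspection_count": 2}
print(uppercase_strings(raw))
```

Users who want to compare their own text against the published data would need to apply the same uppercasing to their inputs first.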

Data and Resources

Additional Info

Field Value
Public Access Level Comment
Temporal Coverage 2020-06-01/present
Geographic Unit Street Address
Data Notes

The Department metadata field is single-valued, so it has been set to "Department of Permits, Licenses, and Inspections" (PLI), but it would be more correct to associate this data with three City departments: PLI, DOMI, and Environmental Services.

Related Document(s)
Frequency - Data Change Daily
Frequency - Publishing Daily
Data Steward
Data Steward Email ip.analytics@pittsburghpa.gov