08 secondary disclosure testing

© 2019 - State of Utah - Department of Technology Services

Secondary Disclosure Testing

There are several definitions of, and viewpoints on, what constitutes “secondary” disclosure testing. The EXPO-QCEW definition applies when exactly one macro aggregate at a given industry level is non-disclosable among all of the codes that make up the next-higher industry aggregate. If that one lower-level aggregate is masked while all of the others in the group are disclosed, the next-higher aggregate can be combined with the disclosed lower-level values to calculate the masked information. To prevent this, a second aggregate at the lower level is flagged as non-disclosable (even though it is disclosable under primary testing), so that the higher-level aggregate can still be disclosed.
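The trigger condition in the EXPO-QCEW definition can be sketched in a few lines of Python. This is only an illustration, not EXPO's actual code: the function name `needs_secondary_masking` and the dictionary layout are assumptions, and the codes are those of the NAICS 517 group used in the example further down this page.

```python
def needs_secondary_masking(disclosable_by_code):
    """Return True when exactly one aggregate in the group fails primary testing.

    disclosable_by_code maps each lower-level code in one parent group to
    True (disclosable) or False (non-disclosable). Hypothetical layout.
    """
    failed = [code for code, ok in disclosable_by_code.items() if not ok]
    return len(failed) == 1


# 4-digit aggregates under sub-sector 517; only 5173 fails primary testing.
group_517 = {"5171": True, "5172": True, "5173": False,
             "5174": True, "5175": True, "5179": True}
print(needs_secondary_masking(group_517))
```

Under this reading, a group in which every aggregate is disclosable, or in which two or more are already masked by primary testing, does not trigger secondary masking.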


An alternative to masking a second low-level aggregate is to mask the next-higher-level aggregate instead. EXPO used this method at one time. In general, however, it blocks more data and is less useful, especially when the masked value is a high-level total (a sub-sector or sector total), so it was dropped in favor of the lower-level “pseudo masking” technique.


The following table shows an example of a set of aggregates for which just one is not disclosable, in this case due to too few firms.


Secondary disclosure testing 01.png


The one non-disclosable 4-digit NAICS aggregate is ‘5173’, since only two U-I account numbers contribute to its total. If only this value were masked, the 3-digit (‘517’) sub-sector total could reveal the missing information by simple subtraction. For instance, the ‘5173’ Month 1 employment would be found as follows:


Mon1(5173) = Mon1(517) - Mon1(5171) - Mon1(5172) - Mon1(5174) - Mon1(5175) - Mon1(5179)

or

Mon1(5173) = 843 - 173 - 202 - 191 - 79 - 169 = 29
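The same subtraction can be checked with a short Python sketch. The variable names are illustrative only; the figures are the Month 1 values quoted in the calculation above.

```python
# Month 1 employment for sub-sector 517 and its published 4-digit aggregates.
subsector_517_total = 843
published_month1 = {"5171": 173, "5172": 202, "5174": 191,
                    "5175": 79, "5179": 169}

# Whatever is left after removing the published cells belongs to the one masked cell.
recovered_5173 = subsector_517_total - sum(published_month1.values())
print(recovered_5173)  # 29
```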


To prevent such deductions, another of the 4-digit aggregates must be masked as though it were non-disclosable as well. This is a secondary disclosure masking, applied so that the sub-sector total can be published without exposing the masked value. There are several ways to select the secondary aggregate. One is to mask the aggregate with the fewest firms. In the case above, however, this would take out the 4-digit NAICS that includes the U-I account with the highest employment in the entire sub-sector (i.e., 5171), removing about 20% of the sub-sector's employment and leaving a large hole in the data. Another method is to select the cell with the smallest “big” employer (here NAICS 5179, whose largest U-I account has 12 employees). Yet this would remove about the same amount of employment as the first method, because so many establishments are included in that aggregate.
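To make the "about 20%" figures concrete, the sketch below computes the share of sub-sector employment that each disclosable aggregate represents. It assumes the Month 1 figures from the subtraction example are the relevant basis; the variable names are illustrative.

```python
# Month 1 employment taken from the subtraction example on this page.
subsector_517_total = 843
month1 = {"5171": 173, "5172": 202, "5174": 191, "5175": 79, "5179": 169}

# Percentage of the sub-sector total that would be hidden by masking each cell.
share_pct = {code: 100 * emp / subsector_517_total for code, emp in month1.items()}

for code, pct in sorted(share_pct.items()):
    print(f"{code}: {pct:.1f}%")
```

On these figures, 5171 and 5179 each account for roughly a fifth of the sub-sector's employment, while 5175 accounts for less than a tenth.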


Secondary disclosure testing 02.png


The method selected for EXPO is to find the disclosable aggregate with the smallest third-month employment and mask it along with the aggregate masked by primary disclosure testing. Here that is NAICS 5175, with 79 employees in the third month. With this total masked along with the 5173 primary, the chart looks like the one above (orange identifies the secondary masking, pink the primary).
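The selection rule can be sketched as follows. This is a minimal illustration rather than EXPO's implementation: `pick_secondary` is a hypothetical name, and because the text only quotes the third-month value for 5175 (79), the other figures are the Month 1 values from the subtraction example, used here as stand-ins for third-month employment.

```python
def pick_secondary(third_month_employment, primary_masked):
    """Return the disclosable code with the smallest third-month employment.

    third_month_employment: dict of code -> employment (hypothetical layout).
    primary_masked: set of codes already masked by primary testing.
    """
    candidates = {code: emp for code, emp in third_month_employment.items()
                  if code not in primary_masked}
    # On ties, min() keeps the first code encountered; the text does not
    # say how EXPO breaks ties, so this behavior is an assumption.
    return min(candidates, key=candidates.get)


# Stand-in values (Month 1 figures, except 5175 whose third-month value is quoted).
employment = {"5171": 173, "5172": 202, "5174": 191, "5175": 79, "5179": 169}
print(pick_secondary(employment, primary_masked={"5173"}))  # 5175
```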


Note: The big employer data is not shown in the output, so the masking shown here is meaningless.


From these values it would be possible to determine that 109 employees were represented in the combination of 5173 and 5175, but the individual values of the two aggregates could not be ascertained. This satisfies the requirements for disclosure masking.
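A brief sketch of why the combined figure alone does not expose either cell: every non-negative split of the 109 employees between the two masked aggregates is consistent with the published data. The variable names are illustrative, and the enumeration ignores any outside knowledge a reader might bring.

```python
combined = 109  # combined employment of 5173 and 5175, as stated in the text

# Every (5173, 5175) pair that adds up to the combined figure.
possible_splits = [(a, combined - a) for a in range(combined + 1)]

print(len(possible_splits))  # 110
print(possible_splits[:3])   # [(0, 109), (1, 108), (2, 107)]
```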

