

Copilot AI commented Jan 30, 2026

Fix DataFilter False Positives for UUIDs and ObjectIds ✅

Problem

The DataFilter incorrectly flagged UUIDs and MongoDB ObjectIds as credit card numbers (PANs), replacing them with [filtered]. This happened when:

  • The value contained mostly digits
  • After removing non-digit characters, it was 13-16 digits long
  • It started with digits that match credit card patterns (2, 4, 5, certain 3/6 patterns)

Root Cause

The filterPanNumbers method stripped all non-digit characters and then checked against PAN regex patterns, which caused false positives for valid identifiers.
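The false positive can be reproduced in isolation. The sketch below assumes two simplified, commonly cited card patterns (Visa and Mastercard); `panPatterns` and `looksLikePan` are hypothetical names used only for illustration, and the patterns in the real DataFilter may differ.

```typescript
// Hypothetical, simplified PAN patterns; the real DataFilter patterns may differ.
const panPatterns: Array<{ brand: string; pattern: RegExp }> = [
  { brand: 'visa', pattern: /^4\d{12}(?:\d{3})?$/ },
  { brand: 'mastercard', pattern: /^5[1-5]\d{14}$/ },
];

// Mimics the pre-fix behaviour described above: strip non-digits, then match.
function looksLikePan(value: string): string | null {
  const digitsOnly = value.replace(/\D/g, '');

  for (const { brand, pattern } of panPatterns) {
    if (pattern.test(digitsOnly)) {
      return brand;
    }
  }

  return null;
}

const uuid = '4a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d';

console.log(uuid.replace(/\D/g, '')); // 4123456789012345
console.log(looksLikePan(uuid)); // visa
```

Once the hex letters and dashes are dropped, nothing distinguishes the remaining 16 digits from a card number, so the identifier has to be recognised before the stripping step.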

Solution

Added UUID and ObjectId detection before PAN checking:

  1. MongoDB ObjectId Regex: Matches 24 hexadecimal characters
  2. UUID Regex: Matches 32 hex characters with enforced dash consistency (all dashes in 8-4-4-4-12 format OR no dashes)

The filterPanNumbers method now:

  1. First checks if value is an ObjectId → keep it
  2. Then checks if value is a UUID → keep it
  3. Only then performs PAN detection → filter if match
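The check order above can be sketched as follows. This is a minimal illustration rather than the merged code: the regex bodies are assumptions derived from the descriptions in this comment, `panRegex` is a simplified stand-in for the real PAN patterns, and the function is written as a free function instead of a class method.

```typescript
// Assumed shape: exactly 24 hexadecimal characters, case-insensitive.
const objectIdRegex = /^[0-9a-f]{24}$/i;

// Assumed shape: either fully dashed 8-4-4-4-12 or 32 hex characters with no dashes.
const uuidRegex =
  /^(?:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}|[0-9a-f]{32})$/i;

// Hypothetical, simplified stand-in for the real PAN patterns (Visa, Mastercard).
const panRegex = /^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14})$/;

function filterPanNumbers(value: string): string {
  // 1. ObjectIds are kept as-is.
  if (objectIdRegex.test(value)) {
    return value;
  }

  // 2. UUIDs are kept as-is.
  if (uuidRegex.test(value)) {
    return value;
  }

  // 3. Only now strip non-digits and look for a PAN.
  const digitsOnly = value.replace(/\D/g, '');

  return panRegex.test(digitsOnly) ? '[filtered]' : value;
}

console.log(filterPanNumbers('4111 1111 1111 1111')); // [filtered]
console.log(filterPanNumbers('4111111111111111abcdefab')); // 4111111111111111abcdefab
```

Both identifier patterns are anchored at start and end, so a bare 13-16 digit card number can never match them and still reaches the PAN check. A value with inconsistent dashes, such as a UUID with one dash removed, fails the UUID pattern and also falls through to the PAN check.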

Changes Made

  • Analyzed DataFilter implementation and identified false positive patterns
  • Added objectIdRegex to detect MongoDB ObjectIds
  • Added uuidRegex to detect UUIDs (enforcing consistent dash usage)
  • Updated filterPanNumbers to check UUID/ObjectId patterns first
  • Fixed tests to use values that would actually fail without the fix
    • UUIDs with exactly 16 digits that match PAN patterns when cleaned
    • ObjectIds with 16 digits that match PAN patterns when cleaned
  • Addressed code review feedback
  • Verified linting and build pass (pre-existing errors unrelated to changes)
  • Ran CodeQL security scan (0 vulnerabilities)

Test Coverage

Tests now use problematic values that would be incorrectly filtered without the fix:

  • ✅ UUID 4a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d (16 digits → Visa pattern)
  • ✅ UUID 5A1B2C3D-4E5F-6A7B-8C9D-0E1F2A3B4C5D (16 digits → Mastercard pattern)
  • ✅ ObjectId 4111111111111111abcdefab (16 digits → Visa pattern)
  • ✅ ObjectId 5111111111111111ABCDEFAB (16 digits → Mastercard pattern)

All tests would FAIL if lines 109-123 are commented out.
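Why these values matter can be checked by counting the digits that survive stripping. The sketch below uses a hypothetical `digitCount` helper and compares the four values listed above with the values tried in the original issue. Only the former land in the 13-16 digit range named in the Problem section, which may explain why the issue's tests did not reproduce the bug.

```typescript
// Counts the digits left after removing every non-digit character.
function digitCount(value: string): number {
  return value.replace(/\D/g, '').length;
}

// Values used by the tests in this PR.
const prValues = [
  '4a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d',
  '5A1B2C3D-4E5F-6A7B-8C9D-0E1F2A3B4C5D',
  '4111111111111111abcdefab',
  '5111111111111111ABCDEFAB',
];

// Values tried in the original issue.
const issueValues = [
  '550e8400-e29b-41d4-a716-446655440000',
  '550e8400e29b41d4a716446655440000',
  '507f1f77bcf86cd799439011',
  '672808419583041003090824',
];

console.log(prValues.map(digitCount)); // [ 16, 16, 16, 16 ]
console.log(issueValues.map(digitCount)); // [ 27, 27, 17, 24 ]
```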

Security Summary

  • No security vulnerabilities introduced (CodeQL scan: 0 alerts)
  • Credit card filtering still works correctly
  • Sensitive key filtering still works correctly
  • Fix only prevents false positives for valid identifiers
Original prompt

This section details the original issue you should resolve

<issue_title>grouper(data-filter):  falsy filtered values</issue_title>
<issue_description>Values are sometimes replaced with [filtered] by mistake. A UUID or ObjectId is probably being mistaken for a PAN.

Image

We need to:

  1. write tests that reproduce the problem
    1.1) ensure the new tests fail
  2. fix DataFilter
  3. ensure the tests pass

Additional info:

  1. In the example above, the wrongly filtered value is stored in the root-level projectId key of GroupWorkerTask, which is inserted by the Collector:

https://github.com/codex-team/hawk.collector/blob/0b11313918afba7e94028589bd1c3b3da0a7eb6c/pkg/server/errorshandler/handler.go#L97

I'm not sure whether the bug affects only this field.

  2. We also have a DataFilter class in the Hawk.Laravel catcher.

https://github.com/codex-team/hawk.laravel/blob/27d8c9f542819db3aad67ed5bdefaa0061732b38/src/Services/DataFilter.php#L10-L15

I'm not sure whether the problem is caused by Hawk.Laravel. It might be, but based on (1) it does not look like it.

  3. I tried to write these tests but did not manage to reproduce the problem:
    test('should not filter UUID values', async () => {
      const uuidV4 = '550e8400-e29b-41d4-a716-446655440000';
      const uuidV4Upper = '550E8400-E29B-41D4-A716-446655440000';
      const uuidWithoutDashes = '550e8400e29b41d4a716446655440000';

      const event = generateEvent({
        context: {
          userId: uuidV4,
          sessionId: uuidV4Upper,
          transactionId: uuidWithoutDashes,
          requestId: uuidV4,
        },
        addons: {
          vue: {
            props: {
              componentId: uuidV4,
            },
          },
        },
      });

      dataFilter.processEvent(event);

      expect(event.context['userId']).toBe(uuidV4);
      expect(event.context['sessionId']).toBe(uuidV4Upper);
      expect(event.context['transactionId']).toBe(uuidWithoutDashes);
      expect(event.context['requestId']).toBe(uuidV4);
      expect(event.addons['vue']['props']['componentId']).toBe(uuidV4);
    });

    test('should not filter MongoDB ObjectId at root level of event', async () => {
      const objectId = '507f1f77bcf86cd799439011';
      /** ObjectId that contains only digits (could be mistaken for PAN) */
      const numericObjectId = '672808419583041003090824';

      const event = {
        projectId: objectId,
        numericProjectId: numericObjectId,
        catcherType: 'errors/php',
        title: 'Test event',
        backtrace: [],
        timestamp: 1769802857,
      };

      dataFilter.processEvent(event as any);

      expect(event.projectId).toBe(objectId);
      expect(event.numericProjectId).toBe(numericObjectId);
    });

    test('should not filter MongoDB ObjectId values in context and addons', async () => {
      const objectId = '507f1f77bcf86cd799439011';
      const objectIdUpper = '507F1F77BCF86CD799439011';

      const event = generateEvent({
        context: {
          projectId: objectId,
          workspaceId: objectIdUpper,
        },
        addons: {
          hawk: {
            projectId: objectId,
          },
        },
      });

      dataFilter.processEvent(event);

      expect(event.context['projectId']).toBe(objectId);
      expect(event.context['workspaceId']).toBe(objectIdUpper);
      expect(event.addons['hawk']['projectId']).toBe(objectId);
    });

</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI changed the title from "[WIP] Fix falsy filtered values in DataFilter" to "Fix DataFilter false positives for UUIDs and MongoDB ObjectIds" Jan 30, 2026
Copilot AI requested a review from neSpecc January 30, 2026 22:19
Copilot AI requested a review from neSpecc January 30, 2026 23:26
@codex-assistant codex-assistant bot marked this pull request as ready for review January 31, 2026 00:20
@codex-assistant

Thanks for adding a description — the PR is now marked as Ready for Review.

@neSpecc neSpecc merged commit 4189b0a into master Jan 31, 2026
5 checks passed
@neSpecc neSpecc deleted the copilot/fix-falsy-filtered-values branch January 31, 2026 00:24

Development

Successfully merging this pull request may close these issues.

grouper(data-filter):  falsy filtered values

2 participants