Customizing Fake Data with Regular Expressions
Regular expressions provide powerful pattern-matching capabilities that can transform how you generate custom fake data. Whether you need specific format validation, industry-standard identifiers, or complex business rule compliance, regex-driven data generation ensures your test data matches exact requirements.
Understanding Regex-Driven Data Generation
Traditional fake data libraries provide general-purpose data, but many applications require data that follows specific patterns, formats, or business rules. Regular expressions allow you to define precise patterns and generate data that conforms to your exact specifications.
Benefits of Regex-Based Generation
Precision: Generate data that exactly matches your validation rules
Compliance: Ensure data meets industry standards and formats
Flexibility: Create complex patterns that adapt to business logic
Validation: Test edge cases and boundary conditions systematically
Consistency: Maintain format consistency across large datasets
Basic Regex Pattern Generation
1. Simple Pattern Matching
Start with basic patterns for common data types:
const RandExp = require('randexp');
// Generate phone numbers in a specific format
const phonePattern = /\(\d{3}\) \d{3}-\d{4}/;
const phoneGenerator = new RandExp(phonePattern);
console.log(phoneGenerator.gen()); // "(555) 123-4567"
console.log(phoneGenerator.gen()); // "(892) 456-7890"
// Generate product codes
const productCodePattern = /[A-Z]{2}\d{4}-[A-Z]{3}/;
const productGenerator = new RandExp(productCodePattern);
console.log(productGenerator.gen()); // "AB1234-XYZ"
console.log(productGenerator.gen()); // "CD5678-QRS"
// Generate license plates
const licensePlatePattern = /[A-Z]{3}-\d{3}/;
const plateGenerator = new RandExp(licensePlatePattern);
console.log(plateGenerator.gen()); // "ABC-123"
console.log(plateGenerator.gen()); // "XYZ-789"
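Generation patterns like the ones above are written without anchors, while the code that consumes the data usually needs the whole string to match. The following is a minimal sketch of that check; `matchesWhole` is a hypothetical helper name, not part of RandExp, and it assumes the pattern does not rely on the `m` flag.

```javascript
// Hypothetical helper: test that `value` matches `pattern` from start to end.
function matchesWhole(pattern, value) {
  // Drop the g and y flags so .test() does not keep state in lastIndex.
  const flags = pattern.flags.replace(/[gy]/g, '');
  const anchored = new RegExp(`^(?:${pattern.source})$`, flags);
  return anchored.test(value);
}

const phoneShape = /\(\d{3}\) \d{3}-\d{4}/;

console.log(phoneShape.test('call (555) 123-4567 now'));          // true: unanchored, matches a substring
console.log(matchesWhole(phoneShape, 'call (555) 123-4567 now')); // false: extra text around the number
console.log(matchesWhole(phoneShape, '(555) 123-4567'));          // true
```

Wrapping the source in a non-capturing group keeps alternations such as `a|b` from escaping the anchors.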
2. Advanced Pattern Techniques
Create more sophisticated patterns for complex requirements:
class CustomPatternGenerator {
constructor() {
this.patterns = new Map();
}
addPattern(name, regex, options = {}) {
this.patterns.set(name, {
regex: new RandExp(regex),
options: options
});
}
generate(patternName, count = 1) {
const pattern = this.patterns.get(patternName);
if (!pattern) throw new Error(`Pattern '${patternName}' not found`);
const results = [];
for (let i = 0; i < count; i++) {
let generated = pattern.regex.gen();
// Apply post-processing if specified
if (pattern.options.transform) {
generated = pattern.options.transform(generated);
}
results.push(generated);
}
return count === 1 ? results[0] : results;
}
}
// Usage examples
const generator = new CustomPatternGenerator();
// Social Security Numbers (US format)
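generator.addPattern('ssn', /\d{3}-\d{2}-\d{4}/, {
// Avoid combinations that are never issued: area 000, 666 or 9xx, group 00, serial 0000
transform: (value) => value
.replace(/^(000|666|9\d{2})/, '123')
.replace(/-00-/, '-01-')
.replace(/-0000$/, '-0001')
});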
// Employee IDs with department prefix
generator.addPattern('employeeId', /(ENG|SAL|MKT|HR)\d{5}/, {
transform: (value) => value.toUpperCase()
});
// Custom email patterns for testing
generator.addPattern('testEmail', /[a-z]{3,8}\.[a-z]{3,8}@(test|dev|staging)\.(com|org|net)/);
// International phone numbers
generator.addPattern('intlPhone', /\+\d{1,3}-\d{3}-\d{3}-\d{4}/);
// Generate samples
console.log(generator.generate('ssn', 5));
console.log(generator.generate('employeeId', 3));
console.log(generator.generate('testEmail', 2));
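Random generation can repeat values, which matters when a field such as an employee ID must be unique. The sketch below shows one way to collect distinct values; `generateUnique` is a hypothetical helper, and the stand-in `fakeEmployeeId` function exists only so the example runs without a library. In practice you would pass something like `() => generator.generate('employeeId')`.

```javascript
// Hypothetical helper: call `generate()` until `count` distinct values are collected.
function generateUnique(generate, count, maxAttempts = count * 20) {
  const seen = new Set();
  let attempts = 0;
  while (seen.size < count) {
    if (attempts++ >= maxAttempts) {
      throw new Error(`Only ${seen.size} unique values after ${maxAttempts} attempts`);
    }
    seen.add(generate());
  }
  return [...seen];
}

// Stand-in generator: a department prefix plus five random digits,
// shaped like the employeeId pattern.
const departments = ['ENG', 'SAL', 'MKT', 'HR'];
function fakeEmployeeId() {
  const dept = departments[Math.floor(Math.random() * departments.length)];
  const digits = String(Math.floor(Math.random() * 100000)).padStart(5, '0');
  return dept + digits;
}

const ids = generateUnique(fakeEmployeeId, 50);
console.log(ids.length, new Set(ids).size);
```

The attempt limit keeps the loop from running forever when the pattern cannot produce enough distinct values.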
Industry-Specific Pattern Generation
1. Financial Data Patterns
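Generate financial identifiers that follow industry formats. These patterns reproduce the shape of each identifier only; they do not compute check digits: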
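const RandExp = require('randexp');
const { faker } = require('@faker-js/faker'); // used by generateBankAccount()
class FinancialPatternGenerator {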
constructor() {
this.patterns = {
// Credit card patterns (test numbers only)
visa: /4\d{3}-\d{4}-\d{4}-\d{4}/,
mastercard: /5[1-5]\d{2}-\d{4}-\d{4}-\d{4}/,
amex: /3[47]\d{2}-\d{6}-\d{5}/,
// Bank routing numbers (ABA format)
routingNumber: /[0-9]{9}/,
// Account numbers
accountNumber: /[0-9]{8,12}/,
// IBAN pattern (simplified)
iban: /[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}([A-Z0-9]?){0,16}/,
// SWIFT codes
swift: /[A-Z]{6}[A-Z0-9]{2}([A-Z0-9]{3})?/
};
}
generateCreditCard(type = 'visa') {
const pattern = this.patterns[type];
if (!pattern) throw new Error(`Unknown card type: ${type}`);
const generator = new RandExp(pattern);
let cardNumber = generator.gen();
// Overwrite the leading digits with a fixed placeholder prefix for this card type
cardNumber = this.makeTestCardNumber(cardNumber, type);
return {
number: cardNumber,
type: type,
cvv: this.generateCVV(type),
expiryDate: this.generateExpiryDate(),
isTestCard: true
};
}
makeTestCardNumber(cardNumber, type = 'visa') {
// Placeholder prefixes that still match each type's pattern (illustrative values)
const prefixes = { visa: '4000', mastercard: '5555', amex: '3700' };
return cardNumber.replace(/^\d{4}/, prefixes[type] || '4000');
}
generateCVV(type) {
const length = type === 'amex' ? 4 : 3;
return new RandExp(`\\d{${length}}`).gen();
}
generateExpiryDate() {
const futureDate = new Date();
futureDate.setFullYear(futureDate.getFullYear() + Math.floor(Math.random() * 5) + 1);
const month = String(futureDate.getMonth() + 1).padStart(2, '0');
const year = String(futureDate.getFullYear()).slice(-2);
return `${month}/${year}`;
}
generateBankAccount() {
return {
routingNumber: new RandExp(this.patterns.routingNumber).gen(),
accountNumber: new RandExp(this.patterns.accountNumber).gen(),
accountType: faker.helpers.arrayElement(['checking', 'savings']),
isTestAccount: true
};
}
generateIBAN(countryCode = 'GB') {
// Simplified IBAN-shaped string for testing: the check digits are random,
// so the result will usually fail a mod-97 IBAN check
const checkDigits = String(Math.floor(Math.random() * 100)).padStart(2, '0');
const bankCode = new RandExp(/[A-Z]{4}/).gen();
const accountNumber = new RandExp(/\d{8}/).gen();
return `${countryCode}${checkDigits}${bankCode}${accountNumber}`;
}
}
// Usage
const finGenerator = new FinancialPatternGenerator();
console.log(finGenerator.generateCreditCard('visa'));
console.log(finGenerator.generateCreditCard('mastercard'));
console.log(finGenerator.generateBankAccount());
console.log(finGenerator.generateIBAN('GB')); // US accounts do not use IBANs
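Card numbers produced from a pattern alone will usually fail the Luhn checksum that most payment forms apply. The sketch below shows one way to repair the last digit after generation; `withLuhnCheckDigit` and `passesLuhn` are hypothetical helper names, and the input number is only an example of the dashed format used above.

```javascript
// Compute the Luhn check digit for a string of digits (the number without its last digit).
function luhnCheckDigit(payload) {
  let sum = 0;
  // Walk right to left; the rightmost payload digit is doubled, then every second digit.
  for (let i = payload.length - 1, double = true; i >= 0; i--, double = !double) {
    let digit = Number(payload[i]);
    if (double) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
  }
  return (10 - (sum % 10)) % 10;
}

// Validate a full number, ignoring the dashes or spaces used for display.
function passesLuhn(cardNumber) {
  const digits = cardNumber.replace(/\D/g, '');
  if (digits.length < 2) return false;
  return luhnCheckDigit(digits.slice(0, -1)) === Number(digits.slice(-1));
}

// Hypothetical helper: overwrite the last digit of a generated number so it passes.
function withLuhnCheckDigit(cardNumber) {
  const digits = cardNumber.replace(/\D/g, '');
  const check = luhnCheckDigit(digits.slice(0, -1));
  // Replace the final digit in the string, leaving separators untouched.
  return cardNumber.replace(/\d(?=\D*$)/, String(check));
}

console.log(passesLuhn('4242-4242-4242-4242')); // true
console.log(passesLuhn(withLuhnCheckDigit('4000-1234-5678-9010'))); // true
```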
2. Healthcare Data Patterns
Generate synthetic healthcare identifiers that contain no real patient information:
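const RandExp = require('randexp');
const { faker } = require('@faker-js/faker'); // used for names, dosages and dates
class HealthcarePatternGenerator {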
constructor() {
this.patterns = {
// National Provider Identifier (NPI)
npi: /[1-9]\d{9}/,
// Medical Record Numbers
mrn: /(MRN|MR)-\d{6,10}/,
// Prescription numbers
prescriptionNumber: /RX\d{7,10}/,
// Insurance member IDs
insuranceMemberId: /[A-Z]{3}\d{7,9}/,
// Lab result IDs
labResultId: /LAB-\d{4}-\d{6}/,
// Appointment IDs
appointmentId: /APT\d{8}/
};
}
generatePatientData() {
return {
// Synthetic identifiers only
patientId: `PAT-${new RandExp(/\d{8}/).gen()}`,
mrn: new RandExp(this.patterns.mrn).gen(),
// Age groups instead of specific dates
ageGroup: faker.helpers.arrayElement([
'0-17', '18-34', '35-54', '55-74', '75+'
]),
// General geographic region
region: faker.helpers.arrayElement([
'Northeast', 'Southeast', 'Midwest', 'Southwest', 'West'
]),
// Synthetic medical data
primaryProvider: {
npi: new RandExp(this.patterns.npi).gen(),
name: `Dr. ${faker.person.lastName()}`,
specialty: faker.helpers.arrayElement([
'Family Medicine', 'Internal Medicine', 'Cardiology', 'Neurology'
])
},
insurance: {
memberId: new RandExp(this.patterns.insuranceMemberId).gen(),
groupNumber: new RandExp(/[A-Z0-9]{6,10}/).gen(),
provider: faker.helpers.arrayElement([
'TestCare Insurance', 'Demo Health Plan', 'Sample Medical Group'
])
},
// Synthetic flags
syntheticRecord: true,
hipaaCompliant: true
};
}
generatePrescription() {
return {
prescriptionNumber: new RandExp(this.patterns.prescriptionNumber).gen(),
medication: faker.helpers.arrayElement([
'Generic Medication A', 'Test Drug B', 'Sample Prescription C'
]),
dosage: `${faker.number.int({ min: 5, max: 500 })}mg`,
frequency: faker.helpers.arrayElement([
'Once daily', 'Twice daily', 'Three times daily', 'As needed'
]),
prescribedDate: faker.date.past({ years: 1 }),
prescribingProvider: new RandExp(this.patterns.npi).gen(),
isTestPrescription: true
};
}
}
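The tenth digit of an NPI is a check digit, calculated with the Luhn formula over the nine-digit identifier prefixed with 80840, so a value from a plain ten-digit pattern will usually fail a strict NPI validator. A minimal sketch of appending that digit follows; `toNpi` is a hypothetical helper name.

```javascript
// Luhn check digit for a digit string (the rightmost digit is doubled first).
function luhnCheckDigit(payload) {
  const sum = [...payload].reverse().reduce((acc, ch, index) => {
    let digit = Number(ch);
    if (index % 2 === 0) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    return acc + digit;
  }, 0);
  return (10 - (sum % 10)) % 10;
}

// Hypothetical helper: turn a nine-digit identifier into a ten-digit NPI
// by appending the check digit computed over "80840" + identifier.
function toNpi(nineDigits) {
  if (!/^\d{9}$/.test(nineDigits)) {
    throw new Error('Expected exactly nine digits');
  }
  return nineDigits + luhnCheckDigit('80840' + nineDigits);
}

console.log(toNpi('123456789')); // "1234567893"
```

To use it with the class above, generate nine digits (for example with `/[1-9]\d{8}/`) and pass the result through `toNpi`.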
Business Rule Implementation
1. Conditional Pattern Generation
Generate data that follows complex business rules:
class BusinessRuleGenerator {
constructor() {
this.rules = new Map();
}
// `patterns` is optional; `patterns.default` is the fallback when no condition matches
addRule(name, conditions, patterns = {}) {
this.rules.set(name, { conditions, patterns });
}
generateByRule(ruleName, context = {}) {
const rule = this.rules.get(ruleName);
if (!rule) throw new Error(`Rule '${ruleName}' not found`);
// Evaluate conditions to determine which pattern to use
for (const condition of rule.conditions) {
if (condition.when(context)) {
const pattern = condition.pattern;
const generator = new RandExp(pattern);
let result = generator.gen();
// Apply any transformations
if (condition.transform) {
result = condition.transform(result, context);
}
return result;
}
}
// Default pattern if no conditions match
const defaultPattern = rule.patterns.default || /[A-Z0-9]{8}/;
return new RandExp(defaultPattern).gen();
}
}
// Example: Employee ID generation based on department and seniority
const businessRules = new BusinessRuleGenerator();
businessRules.addRule('employeeId', [
{
when: (ctx) => ctx.department === 'Engineering' && ctx.seniority === 'Senior',
pattern: /SE\d{4}/, // Senior Engineer
transform: (value, ctx) => `${value}-${ctx.location || 'HQ'}`
},
{
when: (ctx) => ctx.department === 'Engineering',
pattern: /EN\d{4}/, // Engineer
},
{
when: (ctx) => ctx.department === 'Sales' && ctx.seniority === 'Senior',
pattern: /SS\d{4}/, // Senior Sales
},
{
when: (ctx) => ctx.department === 'Sales',
pattern: /SL\d{4}/, // Sales
},
{
when: (ctx) => ctx.seniority === 'Manager',
pattern: /MG\d{4}/, // Manager
}
]);
// Generate employee IDs based on context
console.log(businessRules.generateByRule('employeeId', {
department: 'Engineering',
seniority: 'Senior',
location: 'NYC'
})); // "SE1234-NYC"
console.log(businessRules.generateByRule('employeeId', {
department: 'Sales',
seniority: 'Junior'
})); // "SL5678"
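Because the conditions are evaluated in order and the first match wins, specific rules have to come before general ones. The sketch below isolates that behavior; the `name` field and the `firstMatch` helper are hypothetical and exist only to make the selected rule visible.

```javascript
// Rules are checked top to bottom; the first `when` that returns true wins.
function firstMatch(rules, ctx) {
  const rule = rules.find((candidate) => candidate.when(ctx));
  return rule ? rule.name : 'default';
}

const specificFirst = [
  {
    name: 'seniorEngineer',
    when: (ctx) => ctx.department === 'Engineering' && ctx.seniority === 'Senior',
  },
  { name: 'engineer', when: (ctx) => ctx.department === 'Engineering' },
];

// Same rules with the general one listed first: the specific rule can never be reached.
const generalFirst = [specificFirst[1], specificFirst[0]];

const seniorCtx = { department: 'Engineering', seniority: 'Senior' };
console.log(firstMatch(specificFirst, seniorCtx));               // "seniorEngineer"
console.log(firstMatch(generalFirst, seniorCtx));                // "engineer"
console.log(firstMatch(specificFirst, { department: 'Legal' })); // "default"
```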
2. Cross-Field Validation Patterns
Generate data where multiple fields must be consistent:
class CrossFieldPatternGenerator {
constructor() {
this.fieldRelationships = new Map();
}
addRelationship(primaryField, dependentField, relationship) {
if (!this.fieldRelationships.has(primaryField)) {
this.fieldRelationships.set(primaryField, []);
}
this.fieldRelationships.get(primaryField).push({
field: dependentField,
relationship: relationship
});
}
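generateRelatedFields(primaryField, primaryValue, result = {}) {
result[primaryField] = primaryValue;
const relationships = this.fieldRelationships.get(primaryField) || [];
const added = [];
for (const rel of relationships) {
if (rel.field in result) continue; // already set; also guards against cycles
result[rel.field] = rel.relationship(primaryValue);
added.push(rel.field);
}
// Follow chained relationships (e.g. zipCode -> state -> country)
for (const field of added) {
this.generateRelatedFields(field, result[field], result);
}
return result;
}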
}
// Example: Generate consistent address data
const addressGenerator = new CrossFieldPatternGenerator();
// Define relationships between address fields
addressGenerator.addRelationship('zipCode', 'state', (zipCode) => {
// Simplified: determine state from zip code pattern
const zip = parseInt(zipCode, 10);
if (zip >= 10000 && zip <= 14999) return 'NY';
if (zip >= 90000 && zip <= 96199) return 'CA';
if (zip >= 60000 && zip <= 60999) return 'IL';
return 'XX'; // Default for test data
});
addressGenerator.addRelationship('zipCode', 'city', (zipCode) => {
// Generate city name based on zip code
return `TestCity${zipCode.slice(-3)}`;
});
addressGenerator.addRelationship('state', 'country', (state) => {
return 'US'; // All states map to US
});
// Generate consistent address
const zipCode = new RandExp(/\d{5}/).gen();
const address = addressGenerator.generateRelatedFields('zipCode', zipCode);
console.log(address);
// Example output for zipCode "10001" (the zip code is random, so values vary per run):
// {
// zipCode: "10001",
// state: "NY",
// city: "TestCity001",
// country: "US"
// }
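A fully random five-digit zip code rarely falls inside one of the mapped ranges, so most generated addresses end up with the 'XX' state. One alternative is to start from the state and pick a zip code inside its range. The sketch below assumes the same simplified ranges as the lookup above; `zipForState` and `stateForZip` are hypothetical helper names.

```javascript
// Zip ranges copied from the simplified lookup above; illustrative, not a full table.
const zipRanges = {
  NY: [10000, 14999],
  CA: [90000, 96199],
  IL: [60000, 60999],
};

// Hypothetical helper: pick a random zip code inside the range for `state`.
function zipForState(state) {
  const range = zipRanges[state];
  if (!range) throw new Error(`No zip range defined for ${state}`);
  const [min, max] = range;
  const zip = min + Math.floor(Math.random() * (max - min + 1));
  return String(zip).padStart(5, '0');
}

// Inverse lookup, used here to confirm that both directions agree.
function stateForZip(zipCode) {
  const zip = parseInt(zipCode, 10);
  const match = Object.entries(zipRanges).find(([, [min, max]]) => zip >= min && zip <= max);
  return match ? match[0] : 'XX';
}

const caZip = zipForState('CA');
console.log(caZip, stateForZip(caZip)); // a zip between 90000 and 96199, then "CA"
```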
Data Validation and Testing
1. Pattern Validation Testing
Test your regex patterns thoroughly:
class PatternValidator {
static validatePattern(pattern, testCases, expectedMatches = true) {
const results = {
pattern: pattern.toString(),
passed: 0,
failed: 0,
failures: []
};
for (const testCase of testCases) {
const matches = pattern.test(testCase);
if (matches === expectedMatches) {
results.passed++;
} else {
results.failed++;
results.failures.push({
input: testCase,
expected: expectedMatches,
actual: matches
});
}
}
return results;
}
static generateAndValidate(pattern, count = 100) {
const generator = new RandExp(pattern);
const generated = [];
const validationErrors = [];
for (let i = 0; i < count; i++) {
const value = generator.gen();
generated.push(value);
// Validate that generated value matches the pattern
if (!pattern.test(value)) {
validationErrors.push({
value: value,
error: 'Generated value does not match pattern'
});
}
}
return {
generated: generated,
errors: validationErrors,
successRate: ((count - validationErrors.length) / count * 100).toFixed(2) + '%'
};
}
}
// Test phone number pattern
const phonePattern = /^\(\d{3}\) \d{3}-\d{4}$/;
const validPhones = [
'(555) 123-4567',
'(800) 555-1234',
'(123) 456-7890'
];
const invalidPhones = [
'555-123-4567', // Wrong format
'(555) 1234567', // Missing dash
'(55) 123-4567' // Wrong digit count
];
console.log('Valid phone tests:');
console.log(PatternValidator.validatePattern(phonePattern, validPhones, true));
console.log('\nInvalid phone tests:');
console.log(PatternValidator.validatePattern(phonePattern, invalidPhones, false));
console.log('\nGenerated phone validation:');
console.log(PatternValidator.generateAndValidate(phonePattern, 50));
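Hand-written invalid cases can also be derived mechanically from one valid value. The sketch below is one possible approach, using a hypothetical `mutate` helper; the mutations are only guaranteed to be invalid for fixed-length patterns such as the anchored phone pattern, so treat the result as candidates to check rather than known failures.

```javascript
// Hypothetical helper: derive values that should be rejected from one valid value.
function mutate(valid) {
  const middle = Math.floor(valid.length / 2);
  return [
    valid.slice(1),                                   // first character removed
    valid.slice(0, -1),                               // last character removed
    valid + '0',                                      // extra trailing digit
    valid.slice(0, middle) + valid.slice(middle + 1), // character removed from the middle
    valid.replace(/\d/, 'x'),                         // first digit replaced by a letter
    ' ' + valid,                                      // leading whitespace
  ];
}

const anchoredPhone = /^\(\d{3}\) \d{3}-\d{4}$/;
const rejected = mutate('(555) 123-4567').filter((value) => !anchoredPhone.test(value));
console.log(rejected.length); // 6: every mutation is rejected
```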
2. Edge Case Pattern Testing
Generate edge cases for thorough testing:
class EdgeCaseGenerator {
static generateEdgeCases(basePattern) {
const edgeCases = [];
// Generate minimum length cases
const minPattern = this.createMinimumPattern(basePattern);
if (minPattern) {
edgeCases.push({
type: 'minimum_length',
pattern: minPattern,
value: new RandExp(minPattern).gen()
});
}
// Generate maximum length cases
const maxPattern = this.createMaximumPattern(basePattern);
if (maxPattern) {
edgeCases.push({
type: 'maximum_length',
pattern: maxPattern,
value: new RandExp(maxPattern).gen()
});
}
// Generate boundary cases
const boundaryPatterns = this.createBoundaryPatterns(basePattern);
for (const boundaryPattern of boundaryPatterns) {
edgeCases.push({
type: 'boundary_case',
pattern: boundaryPattern,
value: new RandExp(boundaryPattern).gen()
});
}
return edgeCases;
}
static createMinimumPattern(pattern) {
// Convert quantifiers to their minimum values (simplified: skips escaped
// characters, but does not understand groups such as (?:...) or character classes)
let minPattern = pattern.source; // pattern text without the /.../ wrapper and flags
minPattern = minPattern.replace(/\{(\d+),\d*\}/g, '{$1}'); // {2,5} -> {2}
minPattern = minPattern.replace(/(?<!\\)\+/g, ''); // + -> single occurrence
minPattern = minPattern.replace(/(?<!\\)\*/g, '{0}'); // * -> no occurrence
minPattern = minPattern.replace(/(?<!\\)\?/g, '{0}'); // ? -> no occurrence
return new RegExp(minPattern);
}
static createMaximumPattern(pattern) {
// Convert quantifiers to maximum reasonable values (simplified, see createMinimumPattern)
let maxPattern = pattern.source; // pattern text without the /.../ wrapper and flags
maxPattern = maxPattern.replace(/\{\d*,(\d+)\}/g, '{$1}'); // {2,5} -> {5}
maxPattern = maxPattern.replace(/(?<!\\)\+/g, '{10}'); // + -> reasonable max
maxPattern = maxPattern.replace(/(?<!\\)\*/g, '{10}'); // * -> reasonable max
maxPattern = maxPattern.replace(/(?<!\\)\?/g, ''); // ? -> single occurrence
return new RegExp(maxPattern);
}
static createBoundaryPatterns(pattern) {
// Create patterns for testing character class boundaries
const boundaries = [];
// Number boundaries
if (pattern.toString().includes('\\d')) {
boundaries.push(/0+/); // All zeros
boundaries.push(/9+/); // All nines
}
// Letter boundaries
if (pattern.toString().includes('[A-Z]')) {
boundaries.push(/A+/); // All A's
boundaries.push(/Z+/); // All Z's
}
return boundaries;
}
}
// Example usage
const emailPattern = /[a-z]{3,8}\.[a-z]{3,8}@[a-z]{3,10}\.(com|org|net)/;
const edgeCases = EdgeCaseGenerator.generateEdgeCases(emailPattern);
console.log('Edge cases for email pattern:');
edgeCases.forEach(edge => {
console.log(`${edge.type}: ${edge.value}`);
});
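The boundary patterns returned by `createBoundaryPatterns` (such as `/0+/`) do not keep the shape of the base pattern. A possible alternative, sketched here under the assumption that the pattern only uses `\d` and `[A-Z]` classes, swaps each class for its lowest or highest member and leaves literals and quantifiers alone. `boundaryVariants` is a hypothetical helper name.

```javascript
// Hypothetical helper: swap character classes for their lowest and highest members
// while keeping literals and quantifiers, so each result has the base pattern's shape.
function boundaryVariants(pattern) {
  const source = pattern.source;
  const variants = [];
  if (source.includes('\\d')) {
    variants.push(new RegExp(source.replace(/\\d/g, '0'))); // every digit is 0
    variants.push(new RegExp(source.replace(/\\d/g, '9'))); // every digit is 9
  }
  if (source.includes('[A-Z]')) {
    variants.push(new RegExp(source.replace(/\[A-Z\]/g, 'A'))); // every letter is A
    variants.push(new RegExp(source.replace(/\[A-Z\]/g, 'Z'))); // every letter is Z
  }
  return variants;
}

const productCode = /[A-Z]{2}\d{4}-[A-Z]{3}/;
for (const variant of boundaryVariants(productCode)) {
  console.log(variant.source);
}
// [A-Z]{2}0{4}-[A-Z]{3}
// [A-Z]{2}9{4}-[A-Z]{3}
// A{2}\d{4}-A{3}
// Z{2}\d{4}-Z{3}
```

Each variant is an ordinary RegExp, so it can be passed to `new RandExp(variant)` in the same way as the other edge-case patterns.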
Performance Optimization
1. Pattern Compilation Caching
Optimize regex performance for high-volume generation:
class OptimizedPatternGenerator {
constructor() {
this.compiledPatterns = new Map();
this.generationStats = new Map();
}
compilePattern(name, pattern, options = {}) {
const compiled = {
regex: new RandExp(pattern),
originalPattern: pattern,
options: options,
compiledAt: Date.now()
};
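// Apply optimization options
// Note: RandExp's `max` caps open-ended repetitions (*, +, {n,}); it does not
// limit the total length of the generated string
if (options.maxLength) {
compiled.regex.max = options.maxLength;
}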
this.compiledPatterns.set(name, compiled);
this.generationStats.set(name, { generated: 0, totalTime: 0 });
return compiled;
}
generate(patternName, count = 1) {
const pattern = this.compiledPatterns.get(patternName);
if (!pattern) {
throw new Error(`Pattern '${patternName}' not compiled`);
}
const startTime = Date.now();
const results = [];
for (let i = 0; i < count; i++) {
results.push(pattern.regex.gen());
}
// Update statistics
const stats = this.generationStats.get(patternName);
stats.generated += count;
stats.totalTime += Date.now() - startTime;
return count === 1 ? results[0] : results;
}
getPerformanceStats(patternName) {
const stats = this.generationStats.get(patternName);
if (!stats) return null;
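return {
totalGenerated: stats.generated,
totalTimeMs: stats.totalTime,
// Date.now() has millisecond resolution, so short runs can report 0 ms
averageTimeMs: stats.generated > 0 ? stats.totalTime / stats.generated : 0,
generationsPerSecond: stats.totalTime > 0
? Math.round(stats.generated / stats.totalTime * 1000)
: null
};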
}
optimizePattern(patternName) {
const pattern = this.compiledPatterns.get(patternName);
if (!pattern) return false;
// Apply common optimizations
let optimizedPattern = pattern.originalPattern.source; // text without /.../ and flags
// Replace unbounded quantifiers with fixed ranges, skipping escaped \+ and \*
optimizedPattern = optimizedPattern.replace(/(?<!\\)\+/g, '{1,5}');
optimizedPattern = optimizedPattern.replace(/(?<!\\)\*/g, '{0,5}');
// Create optimized version, keeping the original flags
const optimized = new RegExp(optimizedPattern, pattern.originalPattern.flags);
pattern.regex = new RandExp(optimized);
return true;
}
}
// Usage example
const optimizer = new OptimizedPatternGenerator();
// Compile frequently used patterns
optimizer.compilePattern('userId', /USER_\d{6}_[A-Z]{3}/, { maxLength: 15 });
optimizer.compilePattern('sessionId', /[a-f0-9]{32}/, { maxLength: 32 });
// Generate data
console.time('generation');
const userIds = optimizer.generate('userId', 10000);
const sessionIds = optimizer.generate('sessionId', 10000);
console.timeEnd('generation');
// Check performance
console.log('User ID generation stats:', optimizer.getPerformanceStats('userId'));
console.log('Session ID generation stats:', optimizer.getPerformanceStats('sessionId'));
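Generating a few thousand short strings often takes less than a millisecond, which is below the resolution of `Date.now()`. The sketch below shows one way to time a generator with `performance.now()` from Node's `perf_hooks` module; `benchmark` is a hypothetical helper, and `fakeSessionId` is a stand-in for a call such as `optimizer.generate('sessionId')` so the example runs without a library.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper: time `iterations` calls of any zero-argument generator function.
function benchmark(generate, iterations) {
  const start = performance.now(); // fractional milliseconds
  for (let i = 0; i < iterations; i++) {
    generate();
  }
  const totalMs = performance.now() - start;
  return {
    iterations,
    totalMs,
    // Guard against dividing by zero when the run is too short to measure.
    perSecond: totalMs > 0 ? Math.round((iterations / totalMs) * 1000) : null,
  };
}

// Stand-in generator: 32 random hexadecimal characters.
function fakeSessionId() {
  let id = '';
  for (let i = 0; i < 32; i++) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return id;
}

console.log(benchmark(fakeSessionId, 10000));
```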
Conclusion
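Regular expressions unlock powerful customization capabilities for fake data generation, enabling you to create data that precisely matches your application's requirements. The techniques covered here range from simple format patterns and industry-specific identifiers to conditional business rules, cross-field consistency, pattern validation, edge-case generation, and cached patterns for high-volume runs.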
Ready to create custom data patterns for your specific needs? Start building with our advanced pattern generator that supports regex-driven data generation.