PHP Data Handling: Mastering Strings, Regex, and DateTime Architecture
Effective backend logic relies on precise data parsing. This module explores the nuances of temporal data—transitioning from legacy `date()` to the robust `DateTimeImmutable` architecture—and provides a rigorous examination of string manipulation, ranging from standard formatting to complex pattern matching with Regex and handling international encodings.
Date and Time
date() and time() Functions
time() returns the current Unix timestamp (seconds since 1970-01-01 00:00:00 UTC), and date() formats a timestamp, or the current time when none is given, into a human-readable string using format characters. The output of date() depends on the default timezone.
```php
// Current timestamp
$now = time(); // e.g., 1699876543

// Format current date
echo date('Y-m-d');     // 2024-01-15
echo date('H:i:s');     // 14:30:45
echo date('F j, Y');    // January 15, 2024
echo date('l, F jS Y'); // Monday, January 15th 2024

// Format specific timestamp
$timestamp = 1705329045; // 2024-01-15 14:30:45 UTC
echo date('Y-m-d', $timestamp);

// Common format characters
// Y=year(4), m=month(01-12), d=day(01-31)
// H=hour(24), i=minutes, s=seconds
// l=weekday, F=month name, j=day(1-31)

// Create timestamp
$ts = mktime(14, 30, 0, 1, 15, 2024); // hour, min, sec, month, day, year
$ts = strtotime('2024-01-15');
$ts = strtotime('+1 week');
$ts = strtotime('next Monday');
```
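One detail the functions above leave implicit is that strtotime() reports failure by returning false rather than throwing. The sketch below wraps that check in a small helper; `normalizeDate` is a hypothetical name chosen for illustration, and the default timezone is pinned to UTC only to make the output reproducible.

```php
<?php
// Pin the default timezone so date() and strtotime() give reproducible results.
date_default_timezone_set('UTC');

// Hypothetical helper: convert a free-form date string to Y-m-d, or null if unparseable.
function normalizeDate(string $input): ?string
{
    $ts = strtotime($input);
    if ($ts === false) {
        return null; // strtotime() signals failure with false, not an exception
    }
    return date('Y-m-d', $ts);
}

var_dump(normalizeDate('15 January 2024')); // string(10) "2024-01-15"
var_dump(normalizeDate('not a date'));      // NULL
```

Checking with `=== false` matters here because a valid timestamp of 0 (the epoch itself) is also falsy.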
DateTime Class
DateTime provides an object-oriented approach to date/time manipulation with methods for formatting, modifying, comparing, and calculating differences between dates.
```php
// Create DateTime
$dt = new DateTime();                      // Now
$dt = new DateTime('2024-01-15');          // Specific date
$dt = new DateTime('2024-01-15 14:30:00'); // With time
$dt = new DateTime('+1 week');             // Relative

// Format
echo $dt->format('Y-m-d H:i:s');

// Modify (mutates object!)
$dt->modify('+1 day');
$dt->modify('next Monday');
$dt->setDate(2024, 12, 25);
$dt->setTime(14, 30, 0);

// Compare
$dt1 = new DateTime('2024-01-15');
$dt2 = new DateTime('2024-02-20');
if ($dt1 < $dt2) {
    echo "dt1 is earlier";
}

// Difference
$diff = $dt1->diff($dt2);
echo $diff->days; // Total days difference
echo $diff->format('%y years, %m months, %d days');
```
DateTimeImmutable Class
DateTimeImmutable behaves like DateTime but modification methods return new instances instead of mutating the original, preventing unintended side effects.
```php
// Mutable DateTime - original is modified!
$dt = new DateTime('2024-01-15');
$modified = $dt->modify('+1 day');
echo $dt->format('Y-m-d'); // 2024-01-16 (changed!)

// Immutable - original preserved
$dt = new DateTimeImmutable('2024-01-15');
$modified = $dt->modify('+1 day');
echo $dt->format('Y-m-d');       // 2024-01-15 (unchanged!)
echo $modified->format('Y-m-d'); // 2024-01-16

// Recommended for most use cases
$start = new DateTimeImmutable('2024-01-01');
$end = $start->modify('+1 month');

// All methods return new instances
$future = $dt->add(new DateInterval('P1D'));
$past = $dt->sub(new DateInterval('P1W'));
$changed = $dt->setTimezone(new DateTimeZone('UTC'));
```
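The side-effect risk is easiest to see at a function boundary. The following is a minimal sketch, assuming PHP 8.0+ for `DateTimeImmutable::createFromInterface()`; `dueDate` and the 30-day term are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper: returns the invoice due date, 30 days after issue.
function dueDate(DateTimeInterface $issued): DateTimeImmutable
{
    // Convert to an immutable copy first, so a mutable DateTime
    // passed in by the caller is never changed.
    return DateTimeImmutable::createFromInterface($issued)
        ->add(new DateInterval('P30D'));
}

$issued = new DateTime('2024-01-15', new DateTimeZone('UTC'));
$due = dueDate($issued);

echo $issued->format('Y-m-d'), "\n"; // 2024-01-15 (caller's object untouched)
echo $due->format('Y-m-d'), "\n";    // 2024-02-14
```

Type-hinting `DateTimeInterface` lets the function accept both classes while still returning an immutable value.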
DateInterval and DatePeriod
DateInterval represents a duration (days, months, years), while DatePeriod generates a series of DateTime objects between two dates at regular intervals.
```php
// DateInterval - duration
$interval = new DateInterval('P1Y2M3D'); // 1 year, 2 months, 3 days
$interval = new DateInterval('PT4H5M');  // 4 hours, 5 minutes
$interval = DateInterval::createFromDateString('1 month');

// Add/subtract intervals
$dt = new DateTime();
$dt->add(new DateInterval('P10D')); // Add 10 days
$dt->sub(new DateInterval('P1M'));  // Subtract 1 month

// DatePeriod - range of dates
$start = new DateTime('2024-01-01');
$end = new DateTime('2024-01-31');
$interval = new DateInterval('P1D'); // Every day

// The end date itself is excluded by default, so this prints Jan 1 through Jan 30
$period = new DatePeriod($start, $interval, $end);
foreach ($period as $date) {
    echo $date->format('Y-m-d') . "\n";
}

// Interval format: P[years]Y[months]M[days]DT[hours]H[minutes]M[seconds]S
// P1Y2M3DT4H5M6S = 1 year, 2 months, 3 days, 4 hours, 5 min, 6 sec
```
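Two behaviors of intervals and periods tend to surprise people: month arithmetic rolls over when the target day does not exist, and DatePeriod stops before its end date. The sketch below illustrates both with fixed dates in UTC; the variable names are illustrative only.

```php
<?php
$tz = new DateTimeZone('UTC');
$jan31 = new DateTimeImmutable('2024-01-31', $tz);

// Adding one month to Jan 31 targets "Feb 31", which PHP rolls over into March.
$rolled = $jan31->add(new DateInterval('P1M'));
echo $rolled->format('Y-m-d'), "\n"; // 2024-03-02

// A relative format can clamp to the last day of the month instead.
$clamped = $jan31->modify('last day of next month');
echo $clamped->format('Y-m-d'), "\n"; // 2024-02-29

// DatePeriod excludes the end date by default, so this yields Jan 1, 2 and 3 only.
$period = new DatePeriod(
    new DateTimeImmutable('2024-01-01', $tz),
    new DateInterval('P1D'),
    new DateTimeImmutable('2024-01-04', $tz)
);
$days = [];
foreach ($period as $day) {
    $days[] = $day->format('Y-m-d');
}
echo implode(', ', $days), "\n"; // 2024-01-01, 2024-01-02, 2024-01-03
```

To include the final day, one common approach is to pass an end date one interval later.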
Timezones
Use DateTimeZone to handle timezone conversions properly; always store dates in UTC in databases and convert to user timezones for display.
```php
// Create with timezone
$dt = new DateTime('now', new DateTimeZone('America/New_York'));

// Convert between timezones
$dt = new DateTime('2024-01-15 14:00:00', new DateTimeZone('UTC'));
$dt->setTimezone(new DateTimeZone('Europe/London'));
echo $dt->format('Y-m-d H:i:s T'); // 2024-01-15 14:00:00 GMT

// Set default timezone
date_default_timezone_set('UTC');

// List available timezones
$timezones = DateTimeZone::listIdentifiers();

// Best practice: store in UTC, display in user's timezone
$utc = new DateTimeImmutable('now', new DateTimeZone('UTC'));
$userTz = new DateTimeZone($_SESSION['user_timezone']);
$local = $utc->setTimezone($userTz);
echo $local->format('Y-m-d H:i:s');
```
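Because a user-supplied timezone identifier may be missing or invalid, it can help to validate it before constructing a DateTimeZone. This is a sketch under that assumption; `userTimezone` is a hypothetical helper, and falling back to UTC is one possible policy rather than the only reasonable one.

```php
<?php
// Hypothetical helper: fall back to UTC when the identifier is not a known timezone.
function userTimezone(?string $identifier): DateTimeZone
{
    if ($identifier !== null
        && in_array($identifier, DateTimeZone::listIdentifiers(), true)) {
        return new DateTimeZone($identifier);
    }
    return new DateTimeZone('UTC');
}

// A fixed UTC instant in summer, so the New York result reflects daylight saving time.
$utc = new DateTimeImmutable('2024-07-01 12:00:00', new DateTimeZone('UTC'));

$newYork = $utc->setTimezone(userTimezone('America/New_York'));
$fallback = $utc->setTimezone(userTimezone('Mars/Olympus'));

echo $newYork->format('Y-m-d H:i T'), "\n";  // 2024-07-01 08:00 EDT
echo $fallback->format('Y-m-d H:i T'), "\n"; // 2024-07-01 12:00 UTC
```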
Date Formatting and Parsing
Format dates using format() with format characters, and parse date strings using createFromFormat() for precise control over input format interpretation.
```php
// Formatting
$dt = new DateTime('2024-01-15 14:30:45');
echo $dt->format('Y-m-d');         // 2024-01-15
echo $dt->format('m/d/Y');         // 01/15/2024
echo $dt->format('F j, Y g:i A');  // January 15, 2024 2:30 PM
echo $dt->format('c');             // ISO 8601: 2024-01-15T14:30:45+00:00 (with a UTC default timezone)
echo $dt->format('U');             // Unix timestamp

// Parsing with specific format
// Fields missing from the format (here, the time) are taken from the current clock
$dt = DateTime::createFromFormat('m/d/Y', '01/15/2024');
$dt = DateTime::createFromFormat('d-M-Y', '15-Jan-2024');
$dt = DateTime::createFromFormat('Y-m-d H:i:s', '2024-01-15 14:30:45');

// Check parsing errors: createFromFormat() returns false on failure
$dt = DateTime::createFromFormat('Y-m-d', 'invalid');
if ($dt === false) {
    // Details are available as an array; since PHP 8.2 getLastErrors()
    // returns false when the last parse produced no warnings or errors
    $errors = DateTime::getLastErrors();
    echo "Parse failed";
}
```
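createFromFormat() accepts out-of-range values such as February 30 and rolls them forward with only a warning, so a strict parser usually needs an extra check. The sketch below uses a round-trip comparison; `parseStrict` is a hypothetical helper, and the approach assumes a format whose output reproduces the input exactly (zero-padded fields such as `Y-m-d`).

```php
<?php
// Hypothetical helper: parse a date strictly, rejecting values like 2024-02-30
// that createFromFormat() would otherwise roll over into March.
function parseStrict(string $format, string $value): ?DateTimeImmutable
{
    // The leading "!" resets unspecified fields (such as the time) to their
    // epoch values instead of filling them in from the current clock.
    $dt = DateTimeImmutable::createFromFormat('!' . $format, $value, new DateTimeZone('UTC'));

    // Round trip: formatting the parsed date must reproduce the input exactly.
    if ($dt === false || $dt->format($format) !== $value) {
        return null;
    }
    return $dt;
}

var_dump(parseStrict('Y-m-d', '2024-01-15') !== null); // bool(true)
var_dump(parseStrict('Y-m-d', '2024-02-30'));          // NULL (overflowed date)
var_dump(parseStrict('Y-m-d', 'invalid'));             // NULL
```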
Carbon Library
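Carbon extends PHP's native DateTime with a fluent API, localization, human-readable diff outputs, and many convenience methods for common date operations. Note that Carbon\Carbon is mutable; the library's CarbonImmutable class extends DateTimeImmutable and offers the same API.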
```php
use Carbon\Carbon;

// Create instances
$now = Carbon::now();
$today = Carbon::today();
$tomorrow = Carbon::tomorrow();
$parsed = Carbon::parse('2024-01-15');

// Fluent modifiers
$date = Carbon::now()->addDays(5)->subMonth()->startOfDay();

// Human-readable diffs
echo Carbon::parse('2024-01-15')->diffForHumans(); // e.g., "2 months ago" (relative to now)
$otherDate = Carbon::parse('2024-03-01');
echo $date->diffInDays($otherDate);

// Comparison helpers
$date->isToday();
$date->isPast();
$date->isFuture();
$date->isWeekend();
$date->isBirthday(Carbon::parse('1990-01-15'));

// Localization
Carbon::setLocale('es');
echo $date->translatedFormat('l, F j'); // e.g., lunes, enero 15

// Install: composer require nesbot/carbon
```
Strings
String Creation and Manipulation
PHP strings can be created using single quotes (literal), double quotes (variable interpolation), heredoc (multi-line with interpolation), or nowdoc (multi-line literal). Strings are binary-safe, and individual bytes can be read using array syntax; in multibyte text such as UTF-8, one character may span several bytes.
```php
// Single quotes - literal
$str1 = 'Hello $name'; // Output: Hello $name

// Double quotes - interpolation
$name = "World";
$str2 = "Hello $name";    // Output: Hello World
$str3 = "Hello {$name}!"; // Complex syntax

// Heredoc - multi-line with interpolation
$html = <<<HTML
<div>
    <p>Hello, $name!</p>
</div>
HTML;

// Nowdoc - multi-line literal
$code = <<<'CODE'
$variable = "not interpolated";
CODE;

// String access
$str = "Hello";
echo $str[0];  // Output: H
echo $str[-1]; // Output: o (negative index)

// Concatenation
$greeting = "Hello" . " " . "World";
$greeting .= "!"; // Append
```
String Functions (strlen, strpos, substr, etc.)
PHP offers extensive string functions for measuring length, finding substrings, extracting portions, transforming case, and more. These functions are byte-based; use mb_* equivalents for multibyte/Unicode strings.
```php
$str = "Hello, World!";

// Length
echo strlen($str); // 13

// Find position (0-indexed, false if not found)
echo strpos($str, "World");     // 7
var_dump(strpos($str, "xyz"));  // bool(false); echo would print an empty string
// Always compare with === false, because a match at position 0 is also falsy

// Extract substring
echo substr($str, 0, 5); // "Hello"
echo substr($str, -6);   // "World!"

// Case transformation
echo strtoupper($str);       // "HELLO, WORLD!"
echo strtolower($str);       // "hello, world!"
echo ucfirst("hello");       // "Hello"
echo ucwords("hello world"); // "Hello World"

// Replace
echo str_replace("World", "PHP", $str); // "Hello, PHP!"

// Trim whitespace
echo trim(" hello "); // "hello"
echo ltrim(" hello"); // "hello"
echo rtrim("hello "); // "hello"

// Split and join
$parts = explode(",", "a,b,c"); // ["a", "b", "c"]
$joined = implode("-", $parts); // "a-b-c"

// Reverse
echo strrev("Hello"); // "olleH"
```
┌─────────────────────────────────────────────────────┐
│               Common String Functions               │
├──────────────────┬──────────────────────────────────┤
│ strlen($s)       │ Get string length                │
│ strpos($s, $n)   │ Find first occurrence            │
│ strrpos($s, $n)  │ Find last occurrence             │
│ substr($s, $i)   │ Extract substring                │
│ str_replace()    │ Replace occurrences              │
│ strtoupper($s)   │ Convert to uppercase             │
│ strtolower($s)   │ Convert to lowercase             │
│ trim($s)         │ Strip whitespace                 │
│ explode($d, $s)  │ Split string into array          │
│ implode($g, $a)  │ Join array into string           │
└──────────────────┴──────────────────────────────────┘
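A frequent source of bugs with strpos() is treating its return value as a boolean: position 0 is a valid match but evaluates as false. The sketch below contrasts the loose and strict checks and shows the boolean helpers available from PHP 8.0; the sample path is arbitrary.

```php
<?php
$path = '/var/www/html';

// Pitfall: a match at position 0 is falsy, so a loose check misreports it.
$looseCheck = strpos($path, '/var') ? 'found' : 'not found';
echo $looseCheck, "\n"; // not found (wrong: the match is at index 0)

// Compare against false with the strict operator instead.
$strictCheck = strpos($path, '/var') !== false ? 'found' : 'not found';
echo $strictCheck, "\n"; // found

// PHP 8.0+ offers helpers that return booleans directly.
var_dump(str_contains($path, 'www'));     // bool(true)
var_dump(str_starts_with($path, '/var')); // bool(true)
var_dump(str_ends_with($path, '.php'));   // bool(false)
```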
String Formatting (printf, sprintf)
printf outputs a formatted string directly, while sprintf returns it for later use. Format specifiers like %s (string), %d (integer), %f (float) control how values are inserted and formatted.
```php
$name = "John";
$age = 30;
$price = 49.99;

// printf - prints directly
printf("Name: %s, Age: %d\n", $name, $age);
// Output: Name: John, Age: 30

// sprintf - returns string
$msg = sprintf("User: %s is %d years old", $name, $age);

// Format specifiers
printf("%05d", 42);     // "00042" - zero-padded
printf("%.2f", $price); // "49.99" - 2 decimal places
printf("%10s", "Hi");   // "        Hi" - right-aligned (padded on the left)
printf("%-10s", "Hi");  // "Hi        " - left-aligned (padded on the right)
printf("%'#10s", "Hi"); // "########Hi" - custom padding character

// Argument swapping
printf("%2\$s is %1\$d", 25, "Age"); // "Age is 25"

// Number formatting
$number = 1234567.891;
echo number_format($number, 2, '.', ','); // "1,234,567.89"

// Common format specifiers
// %s = string, %d = integer, %f = float
// %b = binary, %x = hex, %o = octal
// %e = scientific notation, %% = literal %
```
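Width and alignment specifiers are most useful when several rows must line up. The following sketch builds a fixed-width report with sprintf; the line items and column widths are made up for the illustration.

```php
<?php
// Hypothetical line items: [label, quantity, unit price]
$items = [
    ['Widget', 3, 4.5],
    ['Gadget', 12, 19.99],
];

$lines = [];
foreach ($items as [$label, $qty, $price]) {
    // %-10s left-aligns the label in 10 columns, %3d right-aligns the quantity
    // in 3 columns, %8.2f right-aligns the total in 8 columns with 2 decimals.
    $lines[] = sprintf('%-10s %3d %8.2f', $label, $qty, $qty * $price);
}

echo implode("\n", $lines), "\n";
// Widget       3    13.50
// Gadget      12   239.88
```

Using sprintf rather than printf keeps the rows in an array, so they can be logged, joined, or tested before being printed.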
Regular Expressions (PCRE)
PHP uses Perl-Compatible Regular Expressions (PCRE) for pattern matching, providing powerful text search and manipulation capabilities. Patterns are enclosed in delimiters (typically /), with optional modifiers like i (case-insensitive) and m (multiline).
```php
// Pattern structure: /pattern/modifiers
$pattern = '/hello/i'; // Case-insensitive

// Common patterns
$word    = '/^\w+$/';       // Word characters only
$phone   = '/\d{3}-\d{4}/'; // Phone format XXX-XXXX
$letters = '/[a-zA-Z]+/';   // One or more letters (anchor with ^ and $ to require letters only)
$email   = '/^.+@.+\..+$/'; // Simple email

// Modifiers
// i = case-insensitive
// m = multiline (^ and $ match line boundaries)
// s = dotall (. matches newlines)
// x = extended (allows whitespace/comments)
// u = UTF-8 mode

// Character classes
// \d = digit, \D = non-digit
// \w = word char, \W = non-word
// \s = whitespace, \S = non-whitespace
// . = any char (except newline)

// Quantifiers
// * = 0 or more
// + = 1 or more
// ? = 0 or 1
// {n} = exactly n
// {n,m} = between n and m
```
┌─────────────────────────────────────────┐
│         PCRE Pattern Structure          │
├─────────────────────────────────────────┤
│  /pattern/modifiers                     │
│                                         │
│  ^     Start of string                  │
│  $     End of string                    │
│  .     Any character                    │
│  []    Character class                  │
│  ()    Capturing group                  │
│  |     Alternation (OR)                 │
│  \     Escape special char              │
└─────────────────────────────────────────┘
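As a sketch of how the pieces above combine, the example below uses named capturing groups together with the x modifier, then preg_quote() to match user-supplied text literally. The date pattern and the sample strings are illustrative assumptions, not a validation recipe.

```php
<?php
// Named groups (?<name>...) make captures self-describing; the x modifier
// allows whitespace and # comments inside the pattern.
$pattern = '/
    ^(?<year>\d{4})    # four-digit year
    -(?<month>\d{2})   # two-digit month
    -(?<day>\d{2})$    # two-digit day
/x';

if (preg_match($pattern, '2024-01-15', $m) === 1) {
    echo $m['year'], '/', $m['month'], '/', $m['day'], "\n"; // 2024/01/15
}

// preg_quote() escapes metacharacters so the input is matched literally.
// The second argument also escapes the delimiter in use.
$needle = 'price (USD)';
$literal = '/' . preg_quote($needle, '/') . '/';
var_dump(preg_match($literal, 'Total price (USD): 10')); // int(1)
```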
preg_match, preg_replace, preg_split
preg_match tests if a pattern matches and captures groups, preg_replace performs pattern-based substitution, and preg_split divides a string using a regex delimiter. Use preg_match_all to find all matches.
```php
$text = "Contact: john@example.com or jane@test.org";

// preg_match - returns 1 (match), 0 (no match), or false (error)
if (preg_match('/\w+@\w+\.\w+/', $text, $matches)) {
    echo $matches[0]; // "john@example.com" (first match)
}

// With capturing groups
preg_match('/(\w+)@(\w+)\.(\w+)/', $text, $matches);
// $matches = ["john@example.com", "john", "example", "com"]

// preg_match_all - find all matches
preg_match_all('/\w+@\w+\.\w+/', $text, $matches);
// $matches[0] = ["john@example.com", "jane@test.org"]

// preg_replace - pattern substitution
$result = preg_replace('/\d+/', '#', "abc123def456");
// $result = "abc#def#"

// With backreferences
$result = preg_replace('/(\w+)@(\w+)/', '$2:$1', "john@example");
// $result = "example:john"

// preg_split - split by pattern
$parts = preg_split('/[\s,]+/', "one, two three");
// $parts = ["one", "two", "three"]

// Limit and flags
$parts = preg_split('/(\s+)/', "a b c", -1, PREG_SPLIT_DELIM_CAPTURE);
// Includes delimiters in result
```
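Two related tools are worth a short sketch: the PREG_SET_ORDER flag, which groups captures per match, and preg_replace_callback(), which computes each replacement in PHP code. The masking rule below is a hypothetical example, and the email pattern is deliberately simplistic.

```php
<?php
$text = "Contact: john@example.com or jane@test.org";

// PREG_SET_ORDER groups the captures per match instead of per group.
preg_match_all('/(\w+)@(\w+\.\w+)/', $text, $matches, PREG_SET_ORDER);
foreach ($matches as [$full, $user, $domain]) {
    echo "$user at $domain\n";
}
// john at example.com
// jane at test.org

// preg_replace_callback() passes each match to a function;
// here the local part of every address is replaced by asterisks.
$masked = preg_replace_callback(
    '/(\w+)@(\w+\.\w+)/',
    fn (array $m): string => str_repeat('*', strlen($m[1])) . '@' . $m[2],
    $text
);
echo $masked, "\n"; // Contact: ****@example.com or ****@test.org
```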
Multibyte String Functions (mbstring)
The mbstring extension provides multibyte-safe string functions for handling Unicode and other multi-byte encodings (UTF-8, Shift-JIS, etc.). Always use mb_* functions when working with international characters to avoid corruption.
```php
$str = "こんにちは World"; // Japanese + English

// strlen vs mb_strlen
echo strlen($str);    // 21 (bytes: 5 × 3 for the Japanese characters + 6)
echo mb_strlen($str); // 11 (characters)

// Substring
echo mb_substr($str, 0, 5); // "こんにちは"

// Case conversion
$text = "HÉLLO";
echo mb_strtolower($text); // "héllo"
echo strtolower($text);    // "hÉllo" on PHP 8.2+ (ASCII only); locale-dependent before

// Position
echo mb_strpos("日本語", "本"); // 1

// Encoding conversion (UTF-8 to Shift-JIS and back)
$sjis = mb_convert_encoding($str, "SJIS", "UTF-8");
$utf8 = mb_convert_encoding($sjis, "UTF-8", "SJIS");

// Set internal encoding
mb_internal_encoding("UTF-8");

// Common mbstring functions
// mb_strlen() - String length
// mb_substr() - Get substring
// mb_strpos() - Find position
// mb_strtolower() - Lowercase
// mb_strtoupper() - Uppercase
// mb_convert_encoding() - Convert encoding
// mb_detect_encoding() - Detect encoding

// Configure default encoding in php.ini:
// default_charset = "UTF-8"
// (mbstring.internal_encoding is deprecated since PHP 5.6)
```
┌───────────────────────────────────────────────────┐
│ strlen() vs mb_strlen() with "日本語"             │
├───────────────────────────────────────────────────┤
│ strlen():    9  (counts bytes: 3 bytes × 3 chars) │
│ mb_strlen(): 3  (counts actual characters)        │
└───────────────────────────────────────────────────┘
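The byte-versus-character difference becomes a correctness problem when truncating text. The sketch below assumes the mbstring extension is loaded and the input is valid UTF-8; the sample title is arbitrary.

```php
<?php
// Each of these 8 characters occupies 3 bytes in UTF-8.
$title = "日本語のタイトル";

// substr() counts bytes: 4 bytes cuts the second character apart.
$broken = substr($title, 0, 4);
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)

// mb_substr() counts characters, so the result stays valid UTF-8.
$safe = mb_substr($title, 0, 4, 'UTF-8');
echo $safe, "\n";                            // 日本語の
var_dump(mb_check_encoding($safe, 'UTF-8')); // bool(true)
```

Passing the encoding explicitly avoids depending on the configured internal encoding.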