Advanced strings and patterns

Unit: 14 of 19

In this lesson, you will learn advanced string functions and general string manipulation. In the previous lesson we talked about strings and covered the basic operations needed to build any type of web application. In this lesson, we will deepen that knowledge and add some new techniques.

We will start with variable interpolation.

PHP has a very useful feature: it can parse a string so that the variables inside it are replaced by their values rather than treated as plain text. In the previous lesson, we covered the printf function, which takes a format template followed by the values to insert into it, for example printf("my variable %d", $x);

However, we can do this in another way:

printf("my variable $x");

or, more simply:

echo "my variable $x";

In both cases the result will be: my variable 2 (the 2 is an arbitrary number used for the example; the actual value of $x is output). PHP is not omnipotent when it comes to this procedure, though. We must make sure that the variables we want interpolated can be told apart from the surrounding text.

Let’s see the following example:

echo "$a bcd";

The variable $a will be output properly, because a space separates it from the rest of the string. But if the line looks like this:

echo "$abcd";

neither the value of $a nor the rest of the text will be output, because the interpreter looks for a variable named $abcd.

If you want the value of the variable and the rest of the string to be joined with no space between them, you can use braces as separators:

echo "{$a}bcd";

or the concatenation operator:

echo $a . "bcd";

In both cases the result will be the same: 2bcd (remember that 2 is only a hypothetical value; the output contains whatever $a holds).
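The approaches above can be put side by side in a minimal sketch. The value of $a and the $user array are hypothetical, chosen only for illustration:

```php
<?php
// Hypothetical values, used only for this sketch.
$a = 2;
$user = array("name" => "Ana");

$plain  = "$a bcd";                  // a space ends the variable name
$braced = "{$a}bcd";                 // braces mark where the name ends
$joined = $a . "bcd";                // concatenation operator
$nested = "Hello, {$user['name']}!"; // braces also allow array elements

echo $plain, "\n";   // 2 bcd
echo $braced, "\n";  // 2bcd
echo $joined, "\n";  // 2bcd
echo $nested, "\n";  // Hello, Ana!
```

Braces are usually the safer habit, because they keep working when the text around the variable changes.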

Sometimes we will want to achieve the opposite effect. We will want the variable not to be treated as a variable, but as text (for example, we want the string text to be “$a bcd”). To this end, we will pay attention to quotation marks in PHP.

We have seen that in PHP you can create a string in two ways: with single quotes (' ') or with double quotes (" "). Having both is convenient when the text itself must contain quotation marks. It does not matter which kind of quotes delimits the string, as long as they differ from the ones we want to show inside it:

echo '"my" text';

This example outputs the following to the page:

"my" text

The reverse also works. The following example:

echo "'my' text";

outputs:

'my' text

However, this line:

echo ""my" text";

causes a parse error, because the second double quote ends the string early.

Of course, you can also place the escape character (a backslash) before the quote and output it like this: \"

Now back to variable interpolation. Here, too, the type of quotation marks makes a difference. We saw that double quotes allow variables to be parsed. Single quotes, on the other hand, do not parse variables or escape sequences such as \n (only \' and \\ are recognized), so what you put between them is output as written:

echo '$a bcd';

which outputs:

$a bcd
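To compare the quoting rules directly, here is a small sketch. The value of $a is hypothetical:

```php
<?php
$a = 2; // hypothetical value

$double  = "$a bcd";       // interpolated
$single  = '$a bcd';       // literal text, no interpolation
$escaped = "\$a bcd";      // escaped dollar sign inside double quotes
$quoted  = "\"my\" text";  // escaped double quotes inside double quotes

echo $double, "\n";   // 2 bcd
echo $single, "\n";   // $a bcd
echo $escaped, "\n";  // $a bcd
echo $quoted, "\n";   // "my" text
```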

String length

One of the most frequently used string functions is the one that determines the length of a string. This function is called strlen.

To use this function (which, by the way, is very simple), we will first look at what a string is made of. For example, the text my string consists of characters. Each of these characters is stored as one byte (characters outside ASCII can take several bytes in encodings such as UTF-8). This means that our string is nothing but a sequence of bytes. Does this mean that if we access a string by index, as we would an array, we get the individual characters of the string? Yes.

For example:

$a = "my text";
echo $a[3];

This code writes the character t to the screen, which is the fourth character of our string (indexing is zero-based, so index 3 is the fourth position).

Since we have established that a string is a sequence of characters, the strlen function can be seen as a function that counts the members of this sequence:

echo strlen("my text");

It is important to know that PHP strings, unlike C strings, are not zero-terminated: strlen does not stop when it reaches a NUL byte, and it counts every byte in the string, including spaces and other whitespace.

In practice, this means that the example:

$a="     ";
echo strlen($a);

does not produce 0, as one might expect for a string that looks empty, but 5, because the string consists of five space characters.

Here is an example that shows how we can remove spaces from a text:

$s = "my text";
for ($i = 0; $i < strlen($s); $i++) {
    if ($s[$i] != " ") {
        echo $s[$i];
    }
}
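The same result can be obtained without a loop. The sketch below assumes that the built-in str_replace function (covered later in this lesson) is acceptable for the task, and it also shows that strlen counts spaces and NUL bytes. The $withNul string is a hypothetical example:

```php
<?php
$s = "my text";

// Replace every space with an empty string instead of looping.
$noSpaces = str_replace(" ", "", $s);
echo $noSpaces, "\n";          // mytext

// strlen counts every byte, including spaces and NUL bytes.
echo strlen($s), "\n";         // 7
echo strlen($noSpaces), "\n";  // 6
$withNul = "a\0b";             // hypothetical string containing a NUL byte
echo strlen($withNul), "\n";   // 3
```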

Comparing strings

The basic way to compare strings is with the comparison operator:

echo "my text" == "my text";

This approach works well when we handle hard-coded values. But look at the following example:

echo "1my text" == 1;

In PHP versions before 8, this example, like the previous one, evaluates to true, although at first glance the two operands have nothing in common.

PHP inspected the operands and saw that it was comparing a string with an int. For this reason it implicitly converted the left operand to int, which reduced it to 1 (the leading number in the string), and finally compared 1 with 1, which is true. Since PHP 8, a string that is not fully numeric is no longer converted this way; the number is converted to a string instead, so the same comparison is false. Such a mistake may seem unlikely, but do not forget that real programs usually compare dynamically created values (for example, $a == $b), so the outcome can depend on both the data and the PHP version.
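One way to avoid this kind of surprise is the strict comparison operator ===, which compares type and value without any conversion. A minimal sketch, with a hypothetical input value:

```php
<?php
$value = "1my text"; // hypothetical input, e.g. read from a form

// Strict comparison: true only if type and value both match.
$sameAsInt    = ($value === 1);          // false: string vs int
$sameAsString = ($value === "1my text"); // true

// Loose comparison treats two numeric strings as numbers.
$looseNumeric  = ("1" == "01");  // true: both are numeric strings
$strictNumeric = ("1" === "01"); // false: the strings differ

var_dump($sameAsInt, $sameAsString, $looseNumeric, $strictNumeric);
```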

Strings can also be compared in another way, through the strcmp and strcasecmp functions. These functions receive the two strings to compare as parameters and return zero only if the strings are identical:

echo strcmp("my text","my text");

This example outputs the value 0.

The difference between strcmp and strcasecmp is that strcasecmp is not case sensitive.

echo strcmp("My text", "my text"); // the result is negative (-1 as of PHP 8.2)
echo strcasecmp("My text", "my text"); // the result is 0

The functions return a number less than 0 if string1 is less than string2, and a number greater than 0 if string1 is greater than string2. If the strings are identical, 0 is returned.
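Because the exact number returned can differ between PHP versions, it is safer to test only its sign. The following sketch illustrates this; the describe_order helper is a hypothetical name introduced for this example:

```php
<?php
// Hypothetical helper: only the sign of strcmp's result is checked.
function describe_order($left, $right) {
    $cmp = strcmp($left, $right);
    if ($cmp < 0) {
        return "before";
    }
    if ($cmp > 0) {
        return "after";
    }
    return "equal";
}

echo describe_order("apple", "banana"), "\n";   // before
echo describe_order("My text", "my text"), "\n"; // before (uppercase sorts first)

// strcasecmp can be passed to usort for a case-insensitive sort.
$words = array("pear", "Apple", "banana");
usort($words, "strcasecmp");
print_r($words); // Apple, banana, pear
```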

Searching within the string

It is possible to search for a specific character or sequence of characters within a string. The strpos function is used for this purpose.

This function accepts the string to search in and the sequence to search for, and returns the starting position (index) of the first occurrence of the sequence, or the boolean value false if the sequence does not occur.

echo strpos("my text", "ext");

This example outputs the number 4.

The function also accepts an optional third parameter, which sets the position from which the search starts:

echo strpos("my textext", "ext", 5);

In this example, occurrences that start before index 5 are not taken into account, so the first occurrence (at index 4) is skipped. The result of the function is 7, because the next occurrence starts at the eighth character of the string.
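A common pitfall is that a match at the very start of the string returns 0, which a loose comparison treats like false. The sketch below shows one way to check the result; the contains helper is a hypothetical name used only here:

```php
<?php
// Hypothetical helper: true if $needle occurs anywhere in $haystack.
function contains($haystack, $needle) {
    // strpos returns 0 for a match at the start, and 0 == false,
    // so the result is compared with !== rather than !=.
    return strpos($haystack, $needle) !== false;
}

var_dump(strpos("my text", "my"));     // int(0)
var_dump(contains("my text", "my"));   // bool(true)
var_dump(contains("my text", "xyz"));  // bool(false)
```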


Changing the string

In PHP, one part of a string can be replaced by another. This can be done with the functions already mentioned, but also with functions written specifically for this purpose: str_replace and str_ireplace (the same function, but not case sensitive):

echo str_replace("my", "your", "my text");

The str_replace function accepts three (or four) parameters. The first parameter is the part of the string to search for, the second is the replacement, and the third is the string to operate on. The optional fourth parameter is a variable that receives the number of replacements performed.

$a=0;
echo str_replace("my", "your", "my text", $a);
echo $a;

After this example runs, $a holds the value 1, because the word "my" appears only once in the string.

You can also replace several different substrings in a single call by passing arrays:

$arr1 = array("Java", "SQL", "CSS");
$arr2 = array("PHP","MySQL","HTML");
echo str_replace($arr1, $arr2, "I love Java",$a);

This example will change the text to: I love PHP.

Sometimes you will want to change the string at a specific position (index), for example when you have built a list from data obtained sequentially. Such data can very easily end up in the following form:

1,2,3,4,5,

Because you created the list dynamically, you did not know which member would be the last, and you are left with an extra comma (or other separator). In this case, the substr_replace function is an excellent solution:

$x = "1,2,3,4,5,";
echo substr_replace($x, "", strlen($x)-1);

The function accepts the original string, the string to insert as a replacement, and the position where the replacement starts. Since we do not know the length of the string in advance (the list can run from 1 to 5 or from 1 to 1000), we take the length reduced by one as the position (that is, the index of the last character).
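There are other ways to reach the same result, sketched below as alternatives rather than as the recommended approach. A negative position counts from the end of the string, rtrim strips trailing characters, and implode avoids the extra separator in the first place. The $items array is a hypothetical data source:

```php
<?php
$x = "1,2,3,4,5,";

// A negative start position counts from the end of the string.
$viaNegative = substr_replace($x, "", -1);

// rtrim removes the listed characters from the end of the string.
$viaRtrim = rtrim($x, ",");

// implode joins array members and adds no trailing separator.
$items = array(1, 2, 3, 4, 5); // hypothetical source data
$viaImplode = implode(",", $items);

echo $viaNegative, "\n"; // 1,2,3,4,5
echo $viaRtrim, "\n";    // 1,2,3,4,5
echo $viaImplode, "\n";  // 1,2,3,4,5
```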

Separation of a part of the string

With the substr function you can extract a part of a string starting from a specific position. This function accepts three parameters: the string, the start index and, optionally, the number of characters to extract.

$x = "http://www.google.com";
echo substr($x,7);

This example outputs the text that follows the first seven characters. The result is www.google.com. If we omit the third parameter, everything from the start position to the end of the string is taken. If the third parameter is given, that many characters are taken:

$x = "http://www.google.com";
echo substr($x,7,3);

The result of the example is: www
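Both the start index and the length may also be negative, in which case they count from the end of the string. A short sketch using the same URL:

```php
<?php
$x = "http://www.google.com";

$fromStart = substr($x, 7, 3);  // three characters starting at index 7
$lastThree = substr($x, -3);    // the last three characters
$middle    = substr($x, 7, -4); // from index 7, leaving out the last four

echo $fromStart, "\n"; // www
echo $lastThree, "\n"; // com
echo $middle, "\n";    // www.google
```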

String formatting

We have already talked about number formatting through the number_format function. This function offers flexibility in formatting numbers (especially the decimal part):

echo number_format(30.4000,3);

The first parameter of this function is the number itself. The second parameter is the number of decimal places to display. The output of this code is: 30.400

This function also accepts two more optional parameters: the character that separates the decimals and the character that separates the thousands. Before PHP 8 they had to be supplied together.

echo number_format(30000,3,".",",");

The output is:

30,000.000
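The effect of each parameter is easier to see on one number. In this sketch the value of $amount is hypothetical:

```php
<?php
$amount = 1234567.891; // hypothetical value

$rounded  = number_format($amount);              // no decimals, rounded
$twoDec   = number_format($amount, 2);           // two decimals
$european = number_format($amount, 2, ",", "."); // comma decimals, dot thousands
$padded   = number_format(30.4, 3);              // zeros fill missing decimals

echo $rounded, "\n";  // 1,234,568
echo $twoDec, "\n";   // 1,234,567.89
echo $european, "\n"; // 1.234.567,89
echo $padded, "\n";   // 30.400
```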

You can also select locale-specific formats with the setlocale function. It needs the name of the formatting category and the name of the locale. The following example sets the locale for the LC_MONETARY category to en_US; the second call shows the same for another locale:

setlocale(LC_MONETARY, "en_US");

or:

setlocale(LC_MONETARY, "me_JP");

Generic formatting

We have already met this type of formatting in the previous lessons. The functions are printf, sprintf and fprintf. They differ only in where the result goes: printf writes to standard output (the page or the console), sprintf returns the formatted string, and fprintf writes to a file or another stream.

The formatting syntax is as follows:

printf("My number: %d", 100);

The first parameter of this function (the string My number: %d) is the template that goes to the output, while the second (and every following) parameter is a value that replaces a specifier in the string (%d). In this case the specifier is d, which means that the value is output as a decimal integer, but we can also use other specifiers. For example, %b produces binary output:

printf("My number: %b", 10);

With this kind of formatting, you are not limited to a single specifier:

printf("My decimal number: %d, my binary number: %b, my floating point number: %f", 10,10,10);

The following specifiers can be used to format the string:

  • b – binary representation;
  • c – outputs the character with the given ASCII code;
  • d – outputs a signed decimal integer;
  • e – scientific notation with an exponent (printf("%e", 10); outputs 1.000000e+1);
  • u – unsigned decimal integer;
  • f – outputs a floating point number, using the current locale for the decimal separator (printf("%3.2f", $number););
  • F – outputs a floating point number without applying the locale;
  • o – octal representation of the number;
  • s – string (printf("%s", "my text"););
  • x – hexadecimal representation with lowercase letters (ffffff);
  • X – hexadecimal representation with uppercase letters (FFFFFF).
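The specifiers above can be tried out with sprintf, which returns the formatted string instead of printing it. The input values in this sketch are arbitrary:

```php
<?php
$binary  = sprintf("%b", 10);        // binary
$octal   = sprintf("%o", 8);         // octal
$hexLow  = sprintf("%x", 255);       // hexadecimal, lowercase
$hexUp   = sprintf("%X", 255);       // hexadecimal, uppercase
$char    = sprintf("%c", 65);        // character with ASCII code 65
$padded  = sprintf("%05d", 42);      // zero-padded to a width of 5
$rounded = sprintf("%.2F", 3.14159); // two decimals, locale ignored
$expo    = sprintf("%e", 10);        // scientific notation

echo "$binary $octal $hexLow $hexUp $char $padded $rounded $expo\n";
// 1010 10 ff FF A 00042 3.14 1.000000e+1
```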

Regular expressions

When you have no way to solve a string operation in the standard way (for example, the code is too complex), you can use regular expressions.

Regular expressions are patterns, that is, sets of rules, against which a string is checked.

For example, if we want to be sure that a string is written as an email address, we could say that there are some rules for this:

  • It must have a text at the beginning, without special characters;
  • It must have the character @ after the initial text;
  • Must have text after the @ mark;
  • Then, it must have a period and a text after the period. 

This description fits the general shape of an email address.

Since the description tells us certain characteristics of the string, but not its exact content, this is the right place to use a regular expression.

For a text to be treated as a regular expression, we must put delimiters at its beginning and end. The delimiter can be any character that is not a letter, a digit, a backslash or whitespace, but in practice the slash / is used most often.

/my regular expression/

Everything between the delimiters is the content of the regular expression, i.e. the comparison pattern.

The function that compares a regular expression with a string is called preg_match. In its basic form it accepts two parameters, the regular expression and the string being checked, and returns 1 if the pattern matches the text and 0 if it does not (it returns false if an error occurs). The following example returns 1.

echo preg_match("/mytext/","mytext");

Such a trivial comparison could be made with the standard functions. Regular expressions are usually used when the comparison cannot be done any other way. We avoid them in simple cases not because they are complicated to use, but because they are slower than the standard string functions.

For a regular expression to be useful, we must add other elements besides the delimiters.

Meta characters

Meta characters are parts of the regular expression that stand for a certain kind of text:

  • . – Represents any single character (except a newline, by default)
    echo preg_match("/my.ext/","myText"); //returns 1
  • ^ – Represents the beginning of the string
  • $ – Represents the end of the string
  • \s – Represents a whitespace character
    echo preg_match("/my\stext/","my text"); //returns 1
  • \d – Represents any digit
    echo preg_match("/number \d/","number 5"); //returns 1
  • \w – Represents any word character (a letter, a digit or an underscore)
    echo preg_match("/my \w/","my text"); //returns 1
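The list above gives no example for the ^ and $ anchors, so here is a small sketch of how they restrict where a match may occur. The patterns are written in single quotes so that PHP passes backslashes and the $ sign through unchanged:

```php
<?php
$startsWithMy   = preg_match('/^my/', "my text");         // "my" at the start
$startsWithText = preg_match('/^text/', "my text");       // "text" is not at the start
$endsWithText   = preg_match('/text$/', "my text");       // "text" at the end
$wholeString    = preg_match('/^my text$/', "my text");   // the entire string
$withPrefix     = preg_match('/^my text$/', "oh my text"); // extra text in front

echo $startsWithMy, $startsWithText, $endsWithText, $wholeString, $withPrefix, "\n";
// 10110
```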

Several alternatives can be grouped for a single position in the text with the help of square brackets:

echo preg_match("/a[bcd]e/","abe");

In the example above, exactly one of the three alternatives (b, c or d) is allowed between the characters a and e.

Ranges of characters and meta characters can also be combined inside the brackets. The following pattern requires the letter a followed by one character that is b, c or a digit:

echo preg_match("/a[bc\d]/","ab2"); //returns 1
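The position of \d relative to the brackets changes the meaning of the pattern. The following sketch contrasts the two forms and adds a range; the test strings are arbitrary:

```php
<?php
// Inside the brackets: ONE character that is b, c or a digit.
$classWithDigit = preg_match('/a[bc\d]/', "a22");

// Outside the brackets: b or c, and THEN a digit.
$classThenDigit = preg_match('/a[bc]\d/', "ab2");
$missingLetter  = preg_match('/a[bc]\d/', "a22");

// Ranges: only the characters a-f and 0-9 are allowed.
$hexOk  = preg_match('/^[a-f0-9]+$/', "c0ffee");
$hexBad = preg_match('/^[a-f0-9]+$/', "coffee"); // "o" is outside a-f

echo $classWithDigit, $classThenDigit, $missingLetter, $hexOk, $hexBad, "\n";
// 11010
```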

Quantifiers

Quantifiers determine how many times the preceding element may be repeated in a regular expression.

  • * – the element can appear once, multiple times, or not at all:

    echo preg_match("/my s*tring/","my sssstring"); // returns 1
    echo preg_match("/my s*tring/","my string"); // returns 1
    echo preg_match("/my s*tring/","my tring"); // returns 1

  • + – the element must appear once or more than once:

    echo preg_match("/my s+tring/","my sssstring"); // returns 1
    echo preg_match("/my s+tring/","my string"); // returns 1
    echo preg_match("/my s+tring/","my tring"); // returns 0

  • ? – the element may be absent or may appear only once at that location:

    echo preg_match("/my s?tring/","my tring"); // returns 1
    echo preg_match("/my s?tring/","my string"); // returns 1
    echo preg_match("/my s?tring/","my sstring"); // returns 0

  • {n,m} – the element must appear at least n times and at most m times:

    echo preg_match("/my s{1,3}tring/","my ssstring"); // returns 1
    echo preg_match("/my s{1,3}tring/","my sssstring"); // returns 0
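The brace quantifier has two further forms that follow the same idea: {n} for exactly n repetitions and {n,} for n or more. A brief sketch with arbitrary test strings:

```php
<?php
// {5}: exactly five digits between the start and the end of the string.
$fiveDigits = preg_match('/^\d{5}$/', "12345");
$fourDigits = preg_match('/^\d{5}$/', "1234");

// {2,}: two or more repetitions, with no upper limit.
$manyA = preg_match('/^a{2,}$/', "aaaa");
$oneA  = preg_match('/^a{2,}$/', "a");

echo $fiveDigits, $fourDigits, $manyA, $oneA, "\n"; // 1010
```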

Regular expressions within other regular expressions

An entire sub-expression can be treated as a single unit (like one character), for example when we want to apply a quantifier to the whole sub-expression and not just to one character.

A sub-expression is marked inside the regular expression with parentheses (an opening and a closing parenthesis):

echo preg_match("/my (ab.) string/","my abc string");  //returns 1

In the example above, we said that our pattern must contain the word my, followed by a space, then a group of three characters (ab and any one other character), and finally a space followed by the word string.

We could have done this without parentheses, but if we want to apply a quantifier to the whole group (ab.), for example to say that it must be repeated one or more times (+), we simply change the expression:

echo preg_match("/my (ab.)+ string/","my abcabdabe string"); // returns 1
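Parentheses also capture the text they match. preg_match accepts an optional third argument, an array that receives the whole match and the contents of each group. A minimal sketch, in which the variable names are chosen only for illustration:

```php
<?php
$text = "my abcabd string";

// $matches[0] receives the whole match, $matches[1] the first group.
$found = preg_match('/my (ab.)+ string/', $text, $matches);

echo $found, "\n";       // 1
echo $matches[0], "\n";  // my abcabd string
// For a repeated group, only its last repetition is kept.
echo $matches[1], "\n";  // abd
```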

Regular expressions are a discipline of their own, and you do not need to stress too much about them. Although they are very powerful, you will rarely need a pattern that does not already exist somewhere, so you will usually spend more time searching for a suitable pattern than building your own.

 


Exercise 1

In the application, define the following variable:

$string = "myMail@mail.ml";

Write a regular expression that checks whether the value of the variable is an email address.

Solution:

<?php
$string = "myMail@mail.ml";
$pattern = "/^[a-zA-Z0-9]+\@[a-zA-Z0-9]+\.[a-zA-Z]{2,3}$/";
echo preg_match($pattern,$string);
?>

The preg_match function checks whether the string matches the defined pattern, so only the regular expression itself needs explaining. The beginning of the string is marked with the ^ symbol, followed by a class of allowed characters: the letters a to z, in both lowercase and uppercase, and the digits 0 to 9. The + quantifier means that a character from this class must appear one or more times. Next comes the @ character, followed by another run of the same characters. Then a dot is required, followed by a similar run of characters in which digits are not allowed. This last run must be 2 or 3 characters long, and it is followed by the end of the string ($).
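The pattern from the solution is deliberately simple, so it rejects some addresses that are valid in practice, such as ones with a dot before the @ sign. As a point of comparison, and not as part of the exercise, the sketch below checks a few hypothetical sample addresses against both the pattern and the built-in filter_var function with FILTER_VALIDATE_EMAIL:

```php
<?php
// The pattern from the solution, in single quotes.
$pattern = '/^[a-zA-Z0-9]+\@[a-zA-Z0-9]+\.[a-zA-Z]{2,3}$/';

// Hypothetical sample addresses.
$samples = array("myMail@mail.ml", "first.last@mail.com", "not an email");

$results = array();
foreach ($samples as $sample) {
    $results[$sample] = array(
        "pattern" => preg_match($pattern, $sample) === 1,
        // filter_var returns the address on success and false on failure.
        "filter"  => filter_var($sample, FILTER_VALIDATE_EMAIL) !== false,
    );
}
var_dump($results);
```

Here the second sample passes filter_var but fails the exercise pattern, because the pattern does not allow a dot in the part before the @ sign.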


Exercise 2

The following variable is given:

$string = "http://myPage.php?id=25&cat=18&user=34";

Extract all the parameters and put them in an associative array.

Solution:

<?php
$string = "http://myPage.php?id=25&cat=18&user=34";
$pars = explode("?",$string);
$pars = explode("&",$pars[1]);
$parsedPars=array();
for($i=0;$i<sizeof($pars);$i++)
{
    $currentParam = explode("=",$pars[$i]);
    $parsedPars[$currentParam[0]] = $currentParam[1];
}
print_r($parsedPars);
?>

Since the parameters come after the question mark in the given string, we call the explode function with "?" as the separator. Now we have an array with two elements. The element at index 1 holds the parameters that interest us. The parameters are separated from each other with the "&" sign, so we perform another explode. After that, the $pars variable holds an array with three elements. We go through it with the for loop and extract the keys and values. We do this with the explode function as well, but this time with the "=" sign as the separator. Finally, we just need to display the results on the page:

print_r($parsedPars);
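The exercise is meant to practise explode, but PHP also has built-in functions for this job. As an alternative sketch, not the solution the exercise asks for, parse_url can extract the query part and parse_str can turn it into an associative array:

```php
<?php
$string = "http://myPage.php?id=25&cat=18&user=34";

// PHP_URL_QUERY asks parse_url for the part after the question mark.
$query = parse_url($string, PHP_URL_QUERY);

// parse_str fills the array passed as its second argument.
parse_str($query, $params);

echo $query, "\n"; // id=25&cat=18&user=34
print_r($params);  // id => 25, cat => 18, user => 34
```

Note that the values are returned as strings, just as in the explode version.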

 

Exercise 3

The following URL string is given:

$string = "http://myDomain/home/index.php?id=25&cat=18&user=34";

Isolate only the domain with the folders and the page name (myDomain/home/index.php).

Solution:

<?php
$string = "http://myDomain/home/index.php?id=25&cat=18&user=34";
$pars = preg_replace("/http:\/\//","",$string);
$pars = preg_replace("/\?[a-zA-Z0-9=&]+/","",$pars);
print_r($pars);
?>

This exercise is another example of replacing parts of a string, but this time with the preg_replace function and regular expressions instead of str_replace. The first call removes the initial part of the string (http://), while the second call removes the final part (the question mark and the parameters). At the end, the page shows the requested result.
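For comparison, the same result can be reached without regular expressions. The following sketch assumes that the input is always a well-formed URL and uses parse_url, which splits a URL into named components:

```php
<?php
$string = "http://myDomain/home/index.php?id=25&cat=18&user=34";

// parse_url returns an array with keys such as scheme, host, path and query.
$parts = parse_url($string);

// Joining host and path gives the domain, the folders and the page name.
$result = $parts["host"] . $parts["path"];

echo $result, "\n"; // myDomain/home/index.php
```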

Exercise 4

A function must be created that accepts a string and returns its characters in reverse order (for example, the string "my string" becomes "gnirts ym").

Solution:

<?php
function str_reverse($str){
    $rez = "";
    for($i=strlen($str)-1; $i>=0; $i--){
        $rez.=$str[$i];
    }
    return $rez;
}
$text = "my string";
echo str_reverse($text);
?>

The function that performs the task required in the text of the exercise will be called: str_reverse. Since the function must process the string passed to it, we will put a single argument: $str. When finished, the function will return the result as a string. We could assume that the variable that will represent the result of our function will initially be an empty string, and later, probably, it will be a more complex character string. Based on the above, we write the following code:

function str_reverse($str){
    $rez = "";
    return $rez;
}

PHP strings can be processed like arrays. This means that PHP is able to count the characters in the string and to return the character at a given position. Such positions are called indexes, and indexing in PHP starts from zero.

Now we create the $text variable and call the function:

function str_reverse($str){
    $rez = "";
    return $rez;
}
$text = "my string";
echo str_reverse($text);

The string processed by the function is, in our case, "my string". This means that if we want to go through this string with a for loop, we need to know the number of characters in it so that we can visit each character. We can get it with the strlen() function. However, indexing starts at zero, which means that the highest index of our string is one less than the number of characters. Therefore, we set up the following loop, which starts at the last index, counts down to zero and appends the character at each position to the result:

for($i=strlen($str)-1; $i>=0; $i--){
    $rez.=$str[$i];
}

Finally, our exercise is complete and the function is ready to use.
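Writing the loop by hand is a useful exercise, but PHP already ships a built-in function, strrev, that does the same thing. The sketch below repeats the function from the solution so that it can be run on its own and compares the two results:

```php
<?php
// The function from the solution, repeated so the sketch is self-contained.
function str_reverse($str) {
    $rez = "";
    for ($i = strlen($str) - 1; $i >= 0; $i--) {
        $rez .= $str[$i];
    }
    return $rez;
}

$text = "my string";
$manual  = str_reverse($text);
$builtin = strrev($text);

echo $manual, "\n";  // gnirts ym
echo $builtin, "\n"; // gnirts ym
```

Both versions work byte by byte, so they are only reliable for single-byte text such as ASCII.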