Like any programming language, R makes it easy to compile lists of sorted and ordered data. To find substrings, you can use the grep() function, which takes two essential arguments:
pattern: The pattern you want to find.
x: The character vector you want to search.
Suppose you want to find all the states that contain the pattern New. Do it like this:
> grep("New", state.name)
[1] 29 30 31 32
The result of grep() is a numeric vector with the positions of each of the components that contain the matching pattern. In other words, the 29th component of state.name contains the word New.
> state.name[29]
[1] "New Hampshire"
Phew, that worked! But typing in the position of each matching text is going to be a lot of work. Fortunately, you can have grep() return the matching elements themselves, rather than their positions, by adding the argument value = TRUE. Try this:
> grep("New", state.name, value = TRUE)
[1] "New Hampshire" "New Jersey"
[3] "New Mexico"    "New York"
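To see how the two forms relate, here is a minimal sketch that uses the positions returned by grep() to subset state.name and compares the result with what value = TRUE returns. The variable names positions, by_index, and by_value are illustrative only and are not part of the original example.

```r
# Positions of the elements of state.name that contain "New"
positions <- grep("New", state.name)

# Subsetting the original vector with those positions
by_index <- state.name[positions]

# Asking grep() to return the matching elements directly
by_value <- grep("New", state.name, value = TRUE)

# Check whether both approaches give the same character vector
identical(by_index, by_value)
```

If you need the positions for something else later, the subsetting form keeps them around; otherwise value = TRUE is the shorter route.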
The grep() function is case sensitive: it matches only text in the same case (uppercase or lowercase) as your search pattern. If you search for the pattern "new" in lowercase, the result is empty:
> grep("new", state.name, value = TRUE)
character(0)
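If you want the lowercase pattern to match anyway, one option is grep()'s ignore.case argument, which is not used in the example above. The sketch below assumes that setting it to TRUE is acceptable for your search, and the variable name matches is illustrative only.

```r
# Search for the lowercase pattern, but ignore case while matching
matches <- grep("new", state.name, ignore.case = TRUE, value = TRUE)

# Print the matching state names
matches
```

With ignore.case = TRUE, this search should find the same four states as the search for "New".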