Lecture # 15 - Text Files (Advanced Commands)

Working with Text Files. Going through Advanced Commands.

Some advanced commands used for text files are:

  • uniq: The uniq command is used to filter out adjacent, duplicate lines in a file. It reads adjacent matching lines from the input, and only outputs one copy of each line, suppressing subsequent repetitions.

    -> To collapse each run of adjacent identical lines into a single line uniq [file-name] is used.

    -> To prefix each output line with the number of times it occurred in a row uniq -c [file-name] is used.

    -> To ignore the difference between upper-case and lower-case letters when comparing lines

    uniq -i [file-name] is used.

    -> To display only the lines that are repeated (one copy per group) uniq -d [file-name] is used.
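Because uniq only compares neighbouring lines, its output depends on the order of the input. The following is a minimal sketch of the options above; the file name fruits.txt and its contents are made up for illustration.

```shell
# Hypothetical sample file, created in a temporary directory.
tmp=$(mktemp -d)
printf 'apple\napple\nApple\nbanana\napple\n' > "$tmp/fruits.txt"

# Collapse adjacent duplicates. The final "apple" is kept because it is
# not adjacent to the first run of "apple" lines.
plain=$(uniq "$tmp/fruits.txt")

# Ignore case: "apple", "apple" and "Apple" now form a single run.
nocase=$(uniq -i "$tmp/fruits.txt")

# Show only lines that are repeated in a row (one copy per run).
dups=$(uniq -d "$tmp/fruits.txt")

# Sort first so that all duplicates become adjacent, then count them.
counts=$(sort "$tmp/fruits.txt" | uniq -c)

printf '%s\n' "$plain"
rm -rf "$tmp"
```

Sorting before uniq is the usual way to count every duplicate in a file, not only the adjacent ones.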

  • cut: The cut command is used for extracting sections from each line of files or standard input. It's particularly useful for working with text files where data is structured in columns or delimited by a specific character.

    -> To display the column of your choice, where the columns are separated by a given delimiter (for example : ; . ,)

    cut -d '[delimiter]' -f [column-number] [file-name] is used.

    -> To extract a range of characters from each line cut -c [range] [file-name] is used.
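As an illustration of field and character selection, here is a small sketch. The file users.txt and its colon-separated layout are assumptions chosen for the example.

```shell
# Hypothetical colon-delimited sample file.
tmp=$(mktemp -d)
printf 'alice:x:1000:/home/alice\nbob:x:1001:/home/bob\n' > "$tmp/users.txt"

# First field of every line.
first_field=$(cut -d ':' -f 1 "$tmp/users.txt")

# Fields 1 and 3; cut joins them with the same delimiter.
fields_1_3=$(cut -d ':' -f 1,3 "$tmp/users.txt")

# Characters 1 to 3 of every line, regardless of delimiters.
chars=$(cut -c 1-3 "$tmp/users.txt")

printf '%s\n' "$fields_1_3"
rm -rf "$tmp"
```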

  • sed: The sed command, which stands for "stream editor," is a powerful text-processing tool. It operates by performing text transformations on input streams according to specified rules and commands.

    -> To search for a string and replace every occurrence of it sed 's/[string-to-be-replaced]/[new-string]/g' [file-name] is used.

    -> To delete a line sed '[line-number]d' [file-name] is used.

    -> To display only a specific line sed -n '[line-number]p' [file-name] is used.

    -> To add text after the specified line sed '[line-number]a\[text-to-be-added]' [file-name] is used.

    -> To add text before the specified line sed '[line-number]i\[text-to-be-added]' [file-name] is used.
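The sed commands above can be tried on a small file, as in the sketch below. The file name lines.txt is hypothetical. The append and insert commands are written in the two-line form (a backslash followed by a newline), which should work in both GNU and BSD sed; the one-line form is a GNU extension. Without the -i option, sed writes the result to standard output and leaves the file unchanged.

```shell
# Hypothetical three-line sample file.
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/lines.txt"

# Replace every occurrence of "two" with "2".
replaced=$(sed 's/two/2/g' "$tmp/lines.txt")

# Delete line 2.
deleted=$(sed '2d' "$tmp/lines.txt")

# Print only line 2 (-n suppresses the default output).
second=$(sed -n '2p' "$tmp/lines.txt")

# Append a line after line 2.
appended=$(sed '2a\
after two' "$tmp/lines.txt")

# Insert a line before line 2.
inserted=$(sed '2i\
before two' "$tmp/lines.txt")

printf '%s\n' "$inserted"
rm -rf "$tmp"
```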

  • split: The split command is used to split a file into smaller parts. It can be particularly useful for breaking up large files into more manageable chunks for easier handling or for transferring over networks with size limitations.

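    -> To split a file into pieces of a fixed number of lines each (the last piece may be shorter) split -l [number-of-lines] [file-name] is used.

    -> To specify the number of output files instead split -n [number-of-files] [file-name] is used. The -n option is not part of POSIX split, but GNU split provides it.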
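A short sketch of line-based splitting follows. The file name numbers.txt and the output prefix part_ are assumptions for the example; when no prefix is given, split names its pieces xaa, xab, and so on.

```shell
# Hypothetical ten-line sample file.
tmp=$(mktemp -d)
seq 1 10 > "$tmp/numbers.txt"

# Four lines per piece: this creates part_aa (4 lines), part_ab (4 lines)
# and part_ac (the remaining 2 lines).
split -l 4 "$tmp/numbers.txt" "$tmp/part_"

pieces=$(ls "$tmp"/part_* | wc -l)
last_lines=$(wc -l < "$tmp/part_ac")

# Concatenating the pieces in name order reproduces the original file.
rejoined=$(cat "$tmp"/part_* | cmp -s - "$tmp/numbers.txt" && echo same)

echo "pieces: $pieces, lines in last piece: $last_lines, rejoined: $rejoined"
rm -rf "$tmp"
```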

  • tac: The tac command is used to display a file with its lines in reverse order, last line first. This command is written as tac [file-name].
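A one-line sketch of the reversal, assuming GNU coreutils is installed (tac is not available by default on every system). The input is fed through a pipe here; a file name works the same way.

```shell
# Reverse the order of three lines read from standard input.
reversed=$(printf 'first\nsecond\nthird\n' | tac)

printf '%s\n' "$reversed"
```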

  • tr: The tr command is used for translating characters (for example lower-case letters to upper-case letters and vice versa) or deleting characters. It reads only from standard input, which is why the file is supplied with < in the commands below.

    -> To translate letters tr '[range-of-alphabets]' '[translation-range]' < [file-name] is used.

    -> To delete a character tr -d '[character]' < [file-name] is used.

    -> To delete all digits, letters, or alphanumeric characters from a file, a character class is used:

    tr -d '[:digit:]' < [file-name] -> To delete all digits

    tr -d '[:alpha:]' < [file-name] -> To delete all letters

    tr -d '[:alnum:]' < [file-name] -> To delete all alphanumeric characters
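The translations and deletions above can be checked on a short string, as sketched below. The sample text room-42b is made up, and the input comes from a pipe instead of a file redirect.

```shell
# Hypothetical sample text used for every command.
text='room-42b'

# Translate lower-case letters to upper-case.
upper=$(printf '%s\n' "$text" | tr 'a-z' 'A-Z')

# Delete every "o".
no_o=$(printf '%s\n' "$text" | tr -d 'o')

# Delete by character class.
no_digits=$(printf '%s\n' "$text" | tr -d '[:digit:]')
no_alpha=$(printf '%s\n' "$text" | tr -d '[:alpha:]')
no_alnum=$(printf '%s\n' "$text" | tr -d '[:alnum:]')

echo "$upper $no_o $no_digits $no_alpha $no_alnum"
```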

  • grep: The grep command is used to search for patterns or specific strings of text within files or standard input.

    -> To print every line of the file that contains the pattern

    grep "[pattern]" [file-name] is used.

    -> To count the lines that contain the pattern (not the total number of occurrences) grep -c "[pattern]" [file-name] is used.

    -> To print only the matched text instead of the whole line grep -o "[pattern]" [file-name] is used.

    -> To print each matching line together with its line number in the file

    grep -n "[pattern]" [file-name] is used.
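The difference between counting matching lines and counting matches is easy to see on a small file. In this sketch the file name app.log and its contents are hypothetical; the last line deliberately contains the pattern twice.

```shell
# Hypothetical log file; line 3 contains "error" twice.
tmp=$(mktemp -d)
printf 'error: disk full\ninfo: ok\nerror: error again\n' > "$tmp/app.log"

# Lines containing the pattern.
lines=$(grep "error" "$tmp/app.log")

# -c counts matching lines: 2 here.
line_count=$(grep -c "error" "$tmp/app.log")

# -o prints every match on its own line, so wc -l counts occurrences: 3 here.
match_count=$(grep -o "error" "$tmp/app.log" | wc -l)

# -n prefixes each matching line with its line number.
numbered=$(grep -n "error" "$tmp/app.log")

printf '%s\n' "$numbered"
rm -rf "$tmp"
```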

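  • find: The find command is used for searching files and directories within a specified directory hierarchy. This command is written as

    find [path/to/search] -name "[file-name or directory-name]". If a match exists, its path is printed.

    -> If you don't know the path you can search from the current directory with find . -name "[file-name or directory-name]". find * -name "[file-name or directory-name]" also works, but the shell expands * before find runs, so hidden files and directories at the top level are skipped.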
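The following sketch builds a small directory tree and compares the two starting points. The directory and file names (project, docs, notes.txt, .hidden.txt) are made up for illustration.

```shell
# Hypothetical directory tree with one regular and one hidden file.
tmp=$(mktemp -d)
mkdir -p "$tmp/project/docs"
touch "$tmp/project/docs/notes.txt" "$tmp/project/.hidden.txt"

# Search a known starting path; the path of the match is printed.
found=$(find "$tmp/project" -name "notes.txt")

# Starting at "." also reaches hidden entries in the current directory.
dot_count=$(cd "$tmp/project" && find . -name "*.txt" | wc -l)

# "*" is expanded by the shell to the non-hidden entries only (here: docs),
# so .hidden.txt at the top level is never visited.
star_count=$(cd "$tmp/project" && find * -name "*.txt" | wc -l)

echo "found: $found"
echo "matches from '.': $dot_count, matches from '*': $star_count"
expected="$tmp/project/docs/notes.txt"
rm -rf "$tmp"
```

Quoting the pattern, as in "*.txt", keeps the shell from expanding it before find sees it.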