Unix To PowerShell – Cut
PowerShell is definitely gaining momentum in the Windows scripting world, but I still hear folks wanting to rely on Unix-based tools to get their job done. In this series of posts I’m going to look at converting some of the more popular Unix-based tools to PowerShell.
cut
The Unix “cut” command extracts sections from each line of input. Segments can be selected by bytes, characters, or fields separated by a delimiter. Each case requires a range, which is one of N, N-M, N- (N to the end of the line), or -M (beginning of the line to M), where N and M are counted from 1 (there is no zeroth value).
For PowerShell, I’ve omitted support for bytes, but the rest of the features are included. The Parse-Range function parses the range specification described above. It takes a range specifier as input and returns an array of the indices that the range contains. The In-Range function then determines whether a given index is included in the parsed range.
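To make the range handling concrete, here is a minimal sketch of how a range specifier could be expanded into indices and then tested. It is not the original Parse-Range and In-Range code: the function names Expand-CutRange and Test-CutIndex and the -Max parameter (an assumed upper bound, needed so that an open-ended N- range has somewhere to stop) are hypothetical and used only for illustration.

```powershell
# Hypothetical sketch; not the original Parse-Range / In-Range implementation.
function Expand-CutRange {
    param(
        [string] $Range,   # one of N, N-M, N-, or -M (counted from 1)
        [int]    $Max      # assumed upper bound, e.g. the line length or field count
    )

    if ($Range -match '^(\d+)$') {
        # Single value N
        $first = [int]$Matches[1]
        $last  = $first
    }
    elseif ($Range -ne '-' -and $Range -match '^(\d*)-(\d*)$') {
        # N-M, N- (missing end means "to $Max"), or -M (missing start means 1)
        $first = if ($Matches[1]) { [int]$Matches[1] } else { 1 }
        $last  = if ($Matches[2]) { [int]$Matches[2] } else { $Max }
    }
    else {
        throw "Invalid range: '$Range'"
    }

    if ($first -lt 1) { throw "Ranges are counted from 1: '$Range'" }

    # Clamp to the upper bound; an empty or decreasing range yields no indices here.
    $last = [Math]::Min($last, $Max)
    if ($first -gt $last) { return ,@() }
    return ,@($first..$last)
}

function Test-CutIndex {
    param(
        [int[]] $Indices,  # indices produced by Expand-CutRange
        [int]   $Index     # 1-based index to test
    )
    return $Indices -contains $Index
}
```

For example, `Expand-CutRange -Range '3-' -Max 5` expands to 3, 4, 5, and `Test-CutIndex -Indices (Expand-CutRange -Range '-2' -Max 5) -Index 3` is false because only indices 1 and 2 are selected.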
The real work is done in the Do-Cut function. It first checks for input error conditions. Then, for each file supplied, lines are extracted and processed with the given input specifiers. For character ranges, each character is processed, and if its index in the line is in the given range, it is appended to the output line. For field ranges, the line is split into tokens using the delimiter specifier (default is a TAB). Each field is processed, and if its index is in the included range, the field is appended to the output with the given output_delimiter specifier (which defaults to the input delimiter).
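The per-line logic described above can be sketched roughly as follows. This is an illustration under assumptions, not the original Do-Cut code: the function name Select-CutSegment and its parameters are hypothetical, it works on a single line rather than on files, and it takes an already expanded list of 1-based indices. Passing through lines that contain no delimiter mirrors what Unix cut does when -s is not given.

```powershell
# Hypothetical sketch of the per-line processing; not the original Do-Cut implementation.
function Select-CutSegment {
    param(
        [string] $Line,
        [int[]]  $Indices,            # 1-based positions (characters or fields) to keep
        [switch] $Fields,             # field mode when set, character mode otherwise
        [string] $Delimiter = "`t",   # input delimiter, TAB by default
        [string] $OutputDelimiter     # falls back to the input delimiter when omitted
    )

    if (-not $PSBoundParameters.ContainsKey('OutputDelimiter')) {
        $OutputDelimiter = $Delimiter
    }

    if (-not $Fields) {
        # Character mode: keep each character whose 1-based position is selected.
        $kept = for ($i = 0; $i -lt $Line.Length; $i++) {
            if ($Indices -contains ($i + 1)) { $Line[$i] }
        }
        return -join $kept
    }

    # Field mode: a line without the delimiter is passed through unchanged.
    if (-not $Line.Contains($Delimiter)) { return $Line }

    # Split on the literal delimiter (no regex) and keep the selected fields.
    $tokens = $Line.Split([string[]]@($Delimiter), [System.StringSplitOptions]::None)
    $kept = for ($i = 0; $i -lt $tokens.Count; $i++) {
        if ($Indices -contains ($i + 1)) { $tokens[$i] }
    }
    return @($kept) -join $OutputDelimiter
}
```

As a quick check, `Select-CutSegment -Line 'abcdef' -Indices 2,3,4` keeps the characters `bcd`, and `Select-CutSegment -Line 'a,b,c' -Indices 2,3 -Fields -Delimiter ',' -OutputDelimiter ':'` produces `b:c`.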
The options to the Unix cut command are implemented with the following PowerShell arguments:
Unix | PowerShell | Description |
---|---|---|
FILE | -filespec | The files to process. |
-c | -characters | Output only this range of characters. |
-f | -fields | Output only the fields in the given range. |
-d | -delimiter | Use DELIM instead of TAB for input field delimiter. |
-s | -only_delimited | Do not print lines not containing delimiters. |
--output-delimiter | -output_delimiter | Use STRING as the output delimiter. |