A script to split .txt files into smaller files

I have 100 .txt files with several doctor’s notes in each file.

Each note ends with a line that says: "signed by Dr. Name Last name text text text"

Some of the text in each file is highlighted with a --<({ text text text })>-- annotation.

This annotated text signifies a positive status for a disease.
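A minimal Python sketch of how the annotation could be detected, assuming the markers always appear exactly as --<({ and })>-- and are never nested. The names HIGHLIGHT_RE, find_highlights and disease_status are made up for illustration.

```python
import re

# Assumed marker format, taken from the description: --<({ ... })>--
# DOTALL lets a highlight span several lines (an assumption).
HIGHLIGHT_RE = re.compile(r"--<\(\{(.*?)\}\)>--", re.DOTALL)

def find_highlights(text):
    """Return the highlighted spans with the markers stripped."""
    return [m.strip() for m in HIGHLIGHT_RE.findall(text)]

def disease_status(text):
    """Return 1 if the text has at least one highlighted span, else 0."""
    return 1 if HIGHLIGHT_RE.search(text) else 0
```

The non-greedy `.*?` keeps two highlights on the same line from being merged into one match.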

I need a script that will take a set of 100 .txt files and:

1. Split each file into smaller files, each containing a single note, by identifying the lines that read

“signed by Dr. Name Last name text text text.”

2. Then, create a list of the file names and their disease status (i.e., if there is highlighted text in the file, the file is ‘positive’; otherwise it is ‘negative’).

3. For each positive note, create two files:

a. A file with only the highlighted text ( --<({ text text text })>-- )

b. A file with the rest of the text.
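Step 1 above could be sketched in Python roughly as follows. This assumes every note ends with a line that starts with "signed by" (matched here case-insensitively and with optional leading whitespace, which is a guess), and that any text left after the last signature line is kept as a final note. SIGNED_RE and split_notes are illustrative names.

```python
import re

# Assumption: a signature line starts with "signed by", ignoring case
# and leading whitespace.
SIGNED_RE = re.compile(r"^\s*signed by\b", re.IGNORECASE)

def split_notes(text):
    """Split one file's text into notes; signature lines are dropped."""
    notes, current = [], []
    for line in text.splitlines():
        if SIGNED_RE.match(line):
            # The signature line closes the current note and is not kept.
            if any(l.strip() for l in current):
                notes.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    # Assumption: trailing text with no signature is kept as a last note.
    if any(l.strip() for l in current):
        notes.append("\n".join(current).strip())
    return notes
```

Dropping the signature line during the split also covers the requirement that lines beginning with "signed by" do not appear in any output file.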

The end product should be:

1. A set of files, each containing a single doctor’s note.

2. A table with the file names and the disease status (1=positive, 0=negative).

3. A set of files each containing only highlighted text.

4. A set of files each containing only non-highlighted text.

* Highlight markings (i.e., --<({ })>-- ) should be removed.

* Any line beginning with “signed by” should be removed.
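One possible way to produce the four outputs, as a hedged Python sketch. It takes a list of already separated notes (signature lines removed) and writes them under an output folder. The folder names (notes, highlighted, rest), the file naming pattern and the CSV column names are all assumptions, as is the reading that markers are stripped from the single-note files too while the highlighted words stay in place.

```python
import csv
import re
from pathlib import Path

# Assumed marker format from the description: --<({ ... })>--
HIGHLIGHT_RE = re.compile(r"--<\(\{(.*?)\}\)>--", re.DOTALL)

def separate(note):
    """Return (highlighted text, remaining text), both without markers."""
    highlighted = "\n".join(m.strip() for m in HIGHLIGHT_RE.findall(note))
    rest = HIGHLIGHT_RE.sub("", note)
    return highlighted, rest

def write_outputs(notes, stem, out_dir):
    """Write per-note files; return (file name, status) rows for the table."""
    out = Path(out_dir)
    for sub in ("notes", "highlighted", "rest"):
        (out / sub).mkdir(parents=True, exist_ok=True)
    rows = []
    for i, note in enumerate(notes, start=1):
        name = f"{stem}_note{i:03d}.txt"  # hypothetical naming scheme
        status = 1 if HIGHLIGHT_RE.search(note) else 0
        # Single-note file: markers removed, highlighted words kept.
        clean = HIGHLIGHT_RE.sub(lambda m: m.group(1).strip(), note)
        (out / "notes" / name).write_text(clean, encoding="utf-8")
        if status:
            # Only positive notes get the two extra files.
            highlighted, rest = separate(note)
            (out / "highlighted" / name).write_text(highlighted, encoding="utf-8")
            (out / "rest" / name).write_text(rest, encoding="utf-8")
        rows.append((name, status))
    return rows

def write_table(rows, csv_path):
    """Write the file-name / status table (1=positive, 0=negative)."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["file_name", "status"])
        writer.writerows(rows)
```

For 100 input files, the rows returned by write_outputs for each file could be collected into one list and passed to write_table once at the end, so the table covers every note.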

