Getting Frequently Used Phrases
The Phrase frequencies tab lists the phrases and words that are used more than once in the text. Calculating this list can be time-consuming, and in certain phases it can be canceled: during long operations, the "Calculate" button turns into "Abort" and remains interactive.
A phrase is any sequence of words, whether or not it is terminated by a stop character (period, exclamation mark, question mark). A word is any string consisting of alphanumeric characters. Textanz compares phrases regardless of punctuation marks and of the number of spaces, tabs, linefeeds, etc. A single word is treated as a special case of a phrase. The result is a list of the phrases repeated more than once in the text.
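The definitions above can be illustrated with a minimal sketch. It is not Textanz's actual code: the function name is hypothetical, and both keeping apostrophes inside words and ignoring letter case are assumptions made for the illustration.

```python
import re

STOP_CHARS = re.compile(r"[.!?]+")    # period, exclamation mark, question mark
WORD = re.compile(r"[A-Za-z0-9']+")   # assumption: apostrophes stay inside words

def split_into_word_runs(text):
    """Return one list of words per run of text between stop characters."""
    runs = []
    for chunk in STOP_CHARS.split(text):
        # Punctuation, spaces, tabs and linefeeds are simply skipped.
        words = [w.lower() for w in WORD.findall(chunk)]  # assumption: case is ignored
        if words:
            runs.append(words)
    return runs

runs = split_into_word_runs("Don't worry,   be happy.\nDon't worry;\tbe happy now!")
```

Both runs start with the same four words, even though the punctuation and spacing between them differ.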
When the calculation is finished, Textanz displays the results in a table consisting of four columns:
- phrase - the found phrase or word
- frequency - the number of occurrences found
- words - the number of words in the phrase
- dispersion - shows how close to each other the positions of the phrase are in the text. A smaller dispersion indicates a shorter distance between the occurrences.
Each column header acts as a button for sorting the rows. Headers have sort markers (arrows indicating ascending or descending order) and sort indexes. A sort index is the position of a column in the sorting sequence when the data is sorted by several columns one after another. The initial sort settings are as follows:
by phrase length (number of words) descending, then by number of occurrences (frequency) descending, then by text alphabetically ascending.
In our opinion, this ordering presents the results in the most appropriate form and displays important cases first. The order can be changed by clicking the column titles.
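As an illustration, the initial ordering can be expressed as a single composite sort key. This is a sketch only: the dictionary field names mirror the column titles, and the case-insensitive alphabetical comparison is an assumption.

```python
rows = [
    {"phrase": "worry", "frequency": 3, "words": 1},
    {"phrase": "you", "frequency": 2, "words": 1},
    {"phrase": "Don't worry, be happy", "frequency": 2, "words": 4},
]

def default_order(rows):
    """Words descending, then frequency descending, then phrase ascending."""
    return sorted(
        rows,
        key=lambda r: (-r["words"], -r["frequency"], r["phrase"].lower()),
    )

ordered = [r["phrase"] for r in default_order(rows)]
```

The longest phrase comes first; the two single words follow, ordered by frequency.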

Calculation is completed. Using column titles to sort the results
A single click on a column title toggles the sort order between two modes: ascending (marker = arrow up) and descending (marker = arrow down). When only a single column title is clicked, no sort indexes are displayed. To sort by several columns hierarchically, hold down the CTRL key and click several column titles. Both sort markers and sort indexes (1, 2, 3) are then displayed.
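Hierarchical sorting with a free choice of columns and directions can be sketched with repeated stable sorts, applied from the last sort index to the first. The function name and the shape of `sort_spec` are hypothetical; this only illustrates the idea and says nothing about how Textanz implements it.

```python
def sort_by_columns(rows, sort_spec):
    """sort_spec is a list of (column, ascending) pairs in sort-index order."""
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant column
    # first and the most significant column last yields a hierarchical order.
    for column, ascending in reversed(sort_spec):
        result.sort(key=lambda r: r[column], reverse=not ascending)
    return result

rows = [
    {"phrase": "worry", "frequency": 3, "words": 1},
    {"phrase": "you", "frequency": 2, "words": 1},
    {"phrase": "Don't worry, be happy", "frequency": 2, "words": 4},
]
# Sort index 1: frequency ascending; sort index 2: words descending.
result = sort_by_columns(rows, [("frequency", True), ("words", False)])
phrases = [r["phrase"] for r in result]
```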
It is advisable to pay special attention to the dispersion column and to use it for sorting as well. Dispersion shows how noticeable a particular word or phrase is in the text, which makes it an important parameter in stylistic analysis. For example, the same word in two neighboring sentences is more noticeable to the reader than the same word in two different paragraphs. The frequency can be the same in both cases, but the dispersion shows the difference and points to the more severe case of word overuse.
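The exact formula Textanz uses for dispersion is not given here. Purely to illustrate the idea, the sketch below uses a stand-in measure, the mean gap between consecutive occurrence positions; both the measure and the position values are assumptions, not Textanz's calculation.

```python
def mean_gap(positions):
    """Stand-in closeness measure (an assumption, not Textanz's formula):
    the average distance between consecutive occurrence positions."""
    ordered = sorted(positions)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) if gaps else 0.0

neighboring_sentences = mean_gap([10, 14])    # same word a few words apart
different_paragraphs = mean_gap([10, 120])    # same word far apart
```

Both words have a frequency of 2, but the first yields a much smaller value, which matches the description of a smaller dispersion for occurrences that are close together.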
Important note: Textanz does not display shorter phrases if they are parts of longer phrases with the same frequency. For example, after analyzing the text
"In every life we have some trouble
But when you worry you make it double
Don't worry, be happy
Don't worry, be happy now"
Textanz will show the following phrases and frequencies:
- Don't worry, be happy : frequency = 2
- worry : frequency = 3
- you : frequency = 2
Although the phrases "don't worry" and "be happy" are each repeated 2 times, they are not included, because the longer phrase containing them both has the same frequency (2). The word "worry", on the other hand, is repeated 3 times: it is part of the phrase "Don't worry, be happy", but it is also used in another context.
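The rule in this note can be reproduced with a small brute-force sketch. It is an illustration under stated assumptions, not Textanz's algorithm: the function name is hypothetical, letter case is ignored, apostrophes are kept inside words, and phrases are assumed not to span stop characters.

```python
import re
from collections import Counter

WORD = re.compile(r"[A-Za-z0-9']+")   # assumption: apostrophes stay inside words

def repeated_phrases(text):
    """Count repeated word sequences, hiding shorter phrases that are part
    of a longer repeated phrase with the same frequency."""
    counts = Counter()
    for chunk in re.split(r"[.!?]+", text):   # assumption: no phrase spans a stop character
        words = [w.lower() for w in WORD.findall(chunk)]
        for n in range(1, len(words) + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    repeated = {p: c for p, c in counts.items() if c > 1}

    def inside(short, long_):
        n = len(short)
        return any(long_[i:i + n] == short for i in range(len(long_) - n + 1))

    kept = {}
    for phrase, freq in repeated.items():
        covered = any(
            len(other) > len(phrase) and other_freq == freq and inside(phrase, other)
            for other, other_freq in repeated.items()
        )
        if not covered:
            kept[" ".join(phrase)] = freq
    return kept

lyrics = (
    "In every life we have some trouble\n"
    "But when you worry you make it double\n"
    "Don't worry, be happy\n"
    "Don't worry, be happy now"
)
result = repeated_phrases(lyrics)
```

For the lyrics above, `result` contains exactly the three entries listed in the note. The sketch compares every pair of repeated phrases, so it is suitable only for short texts.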
Finding phrases in original text
Besides sorting, context buttons are available in each line. When a particular record is focused, two buttons appear:
Dropdown button. In the Words/phrases tab, it is useful when a phrase is long and does not fit the column width: the button opens a tooltip window containing the full phrase. In the Wordforms tab, this button shows a tooltip with a list of all full words for the current line:

Viewing the original phrase in dropdown tooltip
Magnifier. This button opens a context menu that lets you highlight occurrences of the selected word, phrase, or wordform in the original text. The first menu item highlights all positions and scrolls the first one into the visible area. The following menu items (absolute position numbers) move the cursor to those positions one by one:
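Locating the occurrences of a phrase in the original text, regardless of punctuation and spacing, can be sketched with a regular expression. The function name is hypothetical, and treating positions as character offsets and ignoring letter case are assumptions.

```python
import re

def find_occurrences(text, phrase):
    """Return the character offset of each occurrence of the phrase,
    ignoring punctuation and whitespace between its words."""
    words = re.findall(r"[A-Za-z0-9']+", phrase)
    if not words:
        return []
    # Any run of non-word characters may separate two consecutive words.
    separator = r"[^A-Za-z0-9']+"
    pattern = r"\b" + separator.join(re.escape(w) for w in words) + r"\b"
    return [m.start() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]

text = "Don't worry, be happy\nDon't worry,  be happy now"
phrase_positions = find_occurrences(text, "don't worry be happy")
word_positions = find_occurrences(text, "worry")
```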
Filtering phrases
The "Filter" field lets you filter phrases by any word or substring they contain.

Highlighting all positions of the selected phrase.
The phrase list is filtered by the word "photon".
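Filtering by a contained word or substring amounts to a simple containment test. In the sketch below the function name and the sample phrases are made up for illustration, and the case-insensitive match is an assumption.

```python
def filter_phrases(phrases, needle):
    """Keep only the phrases that contain the given word or substring."""
    needle = needle.lower()   # assumption: the filter ignores letter case
    return [p for p in phrases if needle in p.lower()]

# Made-up sample phrases, for illustration only.
sample = ["photon energy", "energy of the photon", "wave function", "Photons"]
matches = filter_phrases(sample, "photon")
```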
Double-clicking any row acts in the same way as the "highlight all" menu item. The previous selection is cleared automatically when a new selection is made. There is also a toolbar button for clearing the selection on demand.