Productivity of Anglicism Bases in Hyphenated German Compounds

Steven Coats (English Philology, University of Oulu, Finland)
Adrien Barbaresi (Berlin-Brandenburg Academy of Sciences, Germany)

7th CMC-Corpora Conference
Paris Seine University, Cergy-Pontoise, 10 September 2019


  1. Anglicisms and compounds in German

  2. Morphological productivity

  3. Data and methods

  4. Results

  5. Caveats, summary, future outlook

Slides for the presentation are on my homepage at

Anglicisms and compounds in German

  • Anglicism: "A word or idiom that is recognizably English in its form (spelling, pronunciation, morphology, or at least one of the three), but is accepted as an item in the vocabulary of the receptor language"
    (Görlach 2003: 1)

Older anglicisms: Dschungel 'jungle', Streik 'strike'

Newer anglicisms: Discounter 'discount store', Show 'show' (entertainment event)

  • Increase in use of anglicisms since the 19th century

    (Burmasova, 2010; Eisenberg, 2011, 2013; Onysko, 2007)

  • This study: Anglicisms as elements in hyphenated German compounds

Compounds in German

  • Default setting: Constituent elements are joined without hyphens

die Freizeit 'the free time' + das Angebot 'the offer' = das Freizeitangebot 'the range of free-time activities'

  • Exogenous constituents: Treated the same as native German elements

Sport 'sports' + Schau 'view' = die Sportschau 'the sports show' (name of a television show)

Urlaub 'holiday' + Feeling 'feeling' = das Urlaubsfeeling 'feeling of being on holiday'

  • Hyphenation is recommended for compounds containing proper nouns, abbreviated constituents or verb phrases (Duden, 2006; Fleischer & Barz, 2012).

Merkel-Regierung 'Merkel government', Weimar-Zeit 'Weimar era', Wtf-Momente 'Wtf moments', aus-dem-Fenster-gucken 'looking out the window'

  • Hyphenation is also used to disambiguate possible readings

Wach-Stube 'watch room', Wachs-Tube 'wax tube', Ski-Freifahrttag 'free skiing day', Skifrei-Fahrttag 'skiless driving day' (Fleischer & Barz, 2012, p. 193)

Hyphenated and unhyphenated forms

In [2]:
type freq type_concat freq_concat
0 Euro-Zone 3606 Eurozone 16196
1 QR-Code 1809 QRcode 14
2 App-Store 906 Appstore 178
3 Fast-Food 741 Fastfood 17
4 Blog-Post 738 Blogpost 12
5 Open-Source 561 Opensource 26
6 Co-Pilot 543 Copilot 24
7 Web-Site 509 Website 29109
8 Europa-Park 460 Europapark 15
9 Geo-Engineering 451 Geoengineering 31
10 Last-Minute 411 Lastminute 22
11 Small-Talk 408 Smalltalk 1097
12 Live-Blog 397 Liveblog 14
13 Talk-Show 329 Talkshow 11
14 Think-Tank 252 Thinktank 11
15 High-School 250 Highschool 154
16 Web-Shop 203 Webshop 21
17 Web-Sites 192 Websites 44
18 Euro-Crash 189 Eurocrash 25
19 Euro-Land 187 Euroland 664
20 Web-Design 176 Webdesign 30
21 Counter-Strike 176 Counterstrike 23
22 Rock-Band 166 Rockband 12
23 Coffee-Shop 163 Coffeeshop 27
24 High-Speed 160 Highspeed 29
25 Johannes-Passion 154 Johannespassion 11
26 Junk-Food 149 Junkfood 17
27 Euro-Cent 148 Eurocent 27
28 Task-Force 144 Taskforce 198
29 Re-Design 143 Redesign 368
... ... ... ... ...
3046 Vapor-Ware 1 Vaporware 39
3047 Up-Lift 1 Uplift 231
3048 Crowd-funding 1 Crowdfunding 106
3049 Dis-Taste 1 Distaste 65
3050 Race-Course 1 Racecourse 151
3051 Air-Stream 1 Airstream 263
3052 Prime-Test 1 Primetest 12
3053 Gate-Fold 1 Gatefold 50
3054 Jump-suits 1 Jumpsuits 21
3055 Non-Cooperation 1 Noncooperation 22
3056 Sling-shot 1 Slingshot 149
3057 Cuppy-Cake 1 Cuppycake 43
3058 bare-Metal 1 baremetal 12
3059 Chop-Sticks 1 Chopsticks 39
3060 Auto-Configuration 1 Autoconfiguration 15
3061 Cyber-crime 1 Cybercrime 46
3062 sub-Text 1 subtext 10775
3063 Space-time 1 Spacetime 171
3064 Real-Gymnasium 1 Realgymnasium 37
3065 Non-Adherence 1 Nonadherence 13
3066 Blue-Book 1 Bluebook 432
3067 Ice-Rocket 1 Icerocket 19
3068 fairy-Tale 1 fairytale 9351
3069 Off-Stage 1 Offstage 14
3070 Euro-Sport 1 Eurosport 1740
3071 Red-Fish 1 Redfish 159
3072 Weight-loss 1 Weightloss 12
3073 Free-Space 1 Freespace 32
3074 Bit-Torrent 1 Bittorrent 24
3075 Re-Sale 1 Resale 263

3076 rows × 4 columns

Morphological productivity

  • "the possibility for language users to coin, unintentionally, a number of formations that is, in principle, enumerably infinite" (Baayen, 1994b, p. 452, citing Schultink, 1961)
  • In English, -ness is a productive suffix: We can come up with many plausible, interpretable words with the ending: longness, deepness, wideness, sameness, otherness, Frenchness, linguisticness, etc.
  • -th is less productive: length, depth, width, but *samth, *otherth, *Frenchth, *linguisticth, etc.
  • In German, the adjectival suffix -bar is more productive than -sam or -ös, for example (Lüdeling, Evert & Heid, 2000)

Quantifying productivity (Baayen 1993, 1994a, 1994b, 2001, 2003)

Baayen's productivity measures, e.g.

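If $V_{N}(1,c)$ is the number of types of category $c$ that occur exactly once in a corpus of $N$ tokens, and $N_c$ is the total number of tokens of category $c$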

$\mathscr{P} = \frac{V_{N}(1,c)}{N_c}$ = Category-conditioned degree of productivity

  • The ratio of hapax legomena (words that occur only once in a text or corpus) belonging to a morphological category $c$ to the total number of tokens of that morphological category

  • The measure has mainly been used to quantify the productivity of affixes:

$\mathscr{P}$ is high for -ness, low for -th
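The ratio can be illustrated with a minimal Python sketch. The function name `category_productivity`, the suffix test used to define a category, and the toy token list are all assumptions made for illustration; none of them come from the study's data or code.

```python
from collections import Counter


def category_productivity(tokens, belongs_to_category):
    """Return V(1, c) / N_c for the tokens selected by belongs_to_category."""
    counts = Counter(t for t in tokens if belongs_to_category(t))
    n_c = sum(counts.values())  # total tokens in category c
    hapaxes = sum(1 for f in counts.values() if f == 1)  # types seen exactly once
    return hapaxes / n_c if n_c else 0.0


# Hypothetical toy corpus, invented for illustration only
toy_tokens = [
    "sameness", "sameness", "otherness", "Frenchness",
    "length", "length", "length", "depth", "depth", "width", "width",
]

p_ness = category_productivity(toy_tokens, lambda w: w.endswith("ness"))
p_th = category_productivity(toy_tokens, lambda w: w.endswith("th"))
print(p_ness, p_th)
```

In this toy sample two of the four -ness tokens are hapaxes, while no -th form occurs only once, so the measure is higher for -ness. With real corpora the category would be defined by morphological analysis rather than a string suffix.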

  • We consider productivity of anglicism bases in hyphenated German compounds:

  • Does Feeling occur in a large number of hyphenated compounds?

Urlaubs-Feeling, Speckfest-Feeling, Live-Feeling, Kino-Feeling, Karibik-Feeling, Retro-Feeling, Sommer-Feeling, Cabrio-Feeling, Blues-Feeling, WM-Feeling, Südsee-Feeling, Disco-Feeling, HD-Feeling, Open-air-Feeling, Afrika-Feeling, etc.

  • Bases can occur as right-hand constituents, or as left-hand constituents (less common):

Feeling-Seen, Feelings-Tattoo, Feeling-Theorie, Feeling-Musik, Feeling-Natur, Feeling-Style, Feeling-Party, Feeling-Zone, Feeling-Duft, Feeling-Programm, Feelings-Reihe, Feeling-Duftöl, Feeling-Edition, Feeling-Gitarrist, Feeling-Killer, Feeling-mäßig, etc.

  • Or as internal constituents:

Family-Feeling-Attribut, Karibik-Feeling-Duft, Motorrad-Feeling-Augsburg, Soft-Feeling-Lack, Summer-Feeling-Set, Lines-aus-alten-Texten-rappen-um-wieder-die-alten-Feelings-back-zu-holen, Summer-Feeling-Duft, Tempo-Feeling-Wettbewerb, etc.
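The three positions can be told apart by splitting a compound at its hyphens. The following is a minimal sketch under stated assumptions: the helper `base_position` is hypothetical, and its tolerance of a trailing "s" (as in Feelings-Tattoo) is a simplification chosen here, not necessarily the rule used in the study.

```python
import re


def base_position(compound, base):
    """Classify where `base` sits in a hyphenated compound.

    Returns "left", "right", "internal", or None if the base is not a
    hyphen-delimited constituent. Matching is case-insensitive and accepts
    an optional trailing "s" (assumption of this sketch).
    """
    parts = compound.lower().split("-")
    pattern = re.compile(re.escape(base.lower()) + r"s?")
    hits = [i for i, part in enumerate(parts) if pattern.fullmatch(part)]
    if len(parts) < 2 or not hits:
        return None
    first = hits[0]  # only the first occurrence is classified
    if first == 0:
        return "left"
    if first == len(parts) - 1:
        return "right"
    return "internal"


for word in ["Urlaubs-Feeling", "Feelings-Tattoo", "Karibik-Feeling-Duft"]:
    print(word, base_position(word, "Feeling"))
```

Unhyphenated forms such as Urlaubsfeeling are deliberately not matched, since only hyphen-delimited constituents are considered here.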

  • A productivity measure needs to take into account not only the ability to coin new words (i.e. hapax ratios), but also the number of types in which a constituent occurs and the total frequencies of these types (Baayen & Hay, 2002; Baayen, Lieber, & Schreuder, 1997; Baayen, Wurm & Aycock, 2007; De Jong, Schreuder & Baayen, 2000; Hay, 2001; Schreuder & Baayen, 1997)

  • Morphological family size: The number of distinct types in the corpus that contain a particular base

  • Cumulative family frequency: The sum of the frequencies of all types that contain a particular base
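Both family measures can be computed from a list of (compound, frequency) pairs. The sketch below is illustrative only: `family_measures` is a hypothetical helper, the compound forms are borrowed from the examples in this talk, and every frequency is invented.

```python
# (compound, frequency) pairs; all frequencies are invented for illustration
toy_counts = [
    ("Urlaubs-Feeling", 12),
    ("Sommer-Feeling", 7),
    ("Retro-Feeling", 1),
    ("Feeling-Theorie", 1),
    ("Karibik-Feeling-Duft", 1),
    ("Talk-Show", 9),
]


def family_measures(counts, base):
    """Return (family size, cumulative family frequency, hapax count) for base.

    Assumes each compound type appears once in `counts`.
    """
    base = base.lower()
    family = [(w, f) for w, f in counts if base in w.lower().split("-")]
    size = len({w.lower() for w, _ in family})      # distinct types with the base
    cumulative = sum(f for _, f in family)          # summed frequency of those types
    hapaxes = sum(1 for _, f in family if f == 1)   # types occurring exactly once
    return size, cumulative, hapaxes


size, cumulative, hapaxes = family_measures(toy_counts, "Feeling")
print(size, cumulative, hapaxes)
```

Reporting the hapax count alongside the two family measures keeps the link to the hapax-based ratio: a base can have a large, frequent family and still give rise to few new coinages.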

Shannon entropy

$$H_B = -\sum\limits^{n}_{i=1}\frac{F(x_{i})}{F_B}\cdot \log_{2}\frac{F(x_{i})}{F_B}$$
  • Shannon entropy is the average amount of information, in bits, we need to represent the possible configurations of a particular base that occurs in multiple hyphenated compounds with different frequencies (Shannon, 1948)

  • High entropy: base tends to occur in many compounds with similar frequencies

  • Low entropy: base tends to occur in fewer compounds, or with very high frequencies in one or a few types

  • Maximum entropy: $\log_{2} n$, reached when all $n$ types are equally frequent

  • Let's consider the base $show$

In [29]:
# Left-hand-constituent hyphenated compounds in the Twitter corpus for the type "show", sorted by descending frequency
lfreqs = sorted([x for y in tw_compounds[tw_compounds["word"] == "show"]["left"].values for x in y], key=lambda x: x[1], reverse=True)
lfreqs
[('show-down', 27),
 ('show-act', 23),
 ('show-bühne', 22),
 ('show-room', 16),
 ('show-einlage', 12),
 ('show-programm', 9),
 ('show-format', 8),
 ('show-idee', 8),
 ('show-konzept', 8),
 ('show-auftritt', 7),
 ('show-effekt', 7),
 ('show-event', 6),
 ('show-opener', 6),
 ('show-schaumbad', 6),
 ('show-veranstaltung', 6),
 ('show-business', 5),
 ('show-kochen', 5),
 ('show-sensation', 5),
 ('show-time', 5),
 ('show-dance', 4),
 ('show-küche', 4),
 ('show-party', 4),
 ('show-termin', 4),
 ('show-training', 4),
 ('show-und', 4),
 ('show-acts', 3),
 ('show-auftritte', 3),
 ('show-band', 3),
 ('show-biz', 3),
 ('show-cooking', 3),
 ('show-director', 3),
 ('show-gipfel', 3),
 ('show-highlights', 3),
 ('show-kampf', 3),
 ('show-konkurrenz', 3),
 ('show-moderator', 3),
 ('show-off', 3),
 ('show-praktikant', 3),
 ('show-premiere', 3),
 ('show-prozess', 3),
 ('show-reel', 3),
 ('show-stopper', 3),
 ('show-susi', 3),
 ('show-tag', 3),
 ('show-termine', 3),
 ('show-unterhaltung', 3),
 ('show-verbot', 3),
 ('show-woche', 3),
 ('show-übernachtungs-package', 3),
 ('show-abend', 2),
 ('show-aufgüssen', 2),
 ('show-auftakt', 2),
 ('show-aufzeichnung', 2),
 ('show-backen', 2),
 ('show-cases', 2),
 ('show-charakter', 2),
 ('show-cocktailbar', 2),
 ('show-comeback', 2),
 ('show-day', 2),
 ('show-dino', 2),
 ('show-direktor', 2),
 ('show-einlagen', 2),
 ('show-ende', 2),
 ('show-erlebnis', 2),
 ('show-express', 2),
 ('show-feind', 2),
 ('show-finale', 2),
 ('show-folgen', 2),
 ('show-giganten', 2),
 ('show-girls', 2),
 ('show-glams', 2),
 ('show-inszenierung', 2),
 ('show-kandidaten', 2),
 ('show-legende', 2),
 ('show-man', 2),
 ('show-match', 2),
 ('show-mäßig', 2),
 ('show-notes', 2),
 ('show-preise', 2),
 ('show-präsidenten', 2),
 ('show-rekord', 2),
 ('show-specials', 2),
 ('show-talente', 2),
 ('show-tanz', 2),
 ('show-team', 2),
 ('show-tour', 2),
 ('show-truck', 2),
 ('show-truppe', 2),
 ('show-unterbrechung', 2),
 ('show-welt', 2),
 ('show-zeiten', 2),
 ('show-zirkus', 2),
 ('show-abbruch', 1),
 ('show-abschluss', 1),
 ('show-act-präsentation', 1),
 ('show-alpinismus', 1),
 ('show-alternative', 1),
 ('show-and-tell', 1),
 ('show-angebote', 1),
 ('show-ankündigung', 1),
 ('show-anteil', 1),
 ('show-apps', 1),
 ('show-archiv', 1),
 ('show-assi', 1),
 ('show-aus', 1),
 ('show-ausstieg', 1),
 ('show-autos', 1),
 ('show-außenreporter', 1),
 ('show-azubi', 1),
 ('show-ballett', 1),
 ('show-balletts', 1),
 ('show-band-prognose', 1),
 ('show-beginn', 1),
 ('show-beschreibung', 1),
 ('show-bestätigung', 1),
 ('show-besuch', 1),
 ('show-bilder', 1),
 ('show-blöcken', 1),
 ('show-boot', 1),
 ('show-branche', 1),
 ('show-buch', 1),
 ('show-buiness', 1),
 ('show-bühnen', 1),
 ('show-büro', 1),
 ('show-car', 1),
 ('show-cars', 1),
 ('show-case', 1),
 ('show-chance', 1),
 ('show-christen', 1),
 ('show-clown', 1),
 ('show-cocktail', 1),
 ('show-coden', 1),
 ('show-comedian', 1),
 ('show-contact', 1),
 ('show-contest', 1),
 ('show-darbietungen', 1),
 ('show-das', 1),
 ('show-depeche', 1),
 ('show-der', 1),
 ('show-design', 1),
 ('show-details', 1),
 ('show-doppler', 1),
 ('show-dreikampf', 1),
 ('show-duell', 1),
 ('show-effekt-diagramme', 1),
 ('show-effekte', 1),
 ('show-effekten', 1),
 ('show-einsatz', 1),
 ('show-eklat', 1),
 ('show-elementen', 1),
 ('show-ensemble', 1),
 ('show-erfahrung', 1),
 ('show-erfolgen', 1),
 ('show-erinnerungen', 1),
 ('show-eröffnung', 1),
 ('show-events', 1),
 ('show-experimente', 1),
 ('show-fabrik', 1),
 ('show-fahrer', 1),
 ('show-faktor', 1),
 ('show-fans', 1),
 ('show-feature', 1),
 ('show-feier', 1),
 ('show-feuerwerk', 1),
 ('show-film', 1),
 ('show-flat', 1),
 ('show-flieger', 1),
 ('show-flop', 1),
 ('show-flächen', 1),
 ('show-folge', 1),
 ('show-formaten', 1),
 ('show-freistöße', 1),
 ('show-frisör', 1),
 ('show-fälscher', 1),
 ('show-gast', 1),
 ('show-gedöns', 1),
 ('show-gefangenen', 1),
 ('show-gehversuche', 1),
 ('show-gelaber', 1),
 ('show-gen', 1),
 ('show-geschichte', 1),
 ('show-geschäft', 1),
 ('show-geschäfts', 1),
 ('show-gesicht', 1),
 ('show-gewinner', 1),
 ('show-gott', 1),
 ('show-grillen', 1),
 ('show-größen', 1),
 ('show-gute', 1),
 ('show-gutschein', 1),
 ('show-gäste', 1),
 ('show-hahnenkämpfe', 1),
 ('show-haus', 1),
 ('show-heimspiel', 1),
 ('show-hemd', 1),
 ('show-highlight', 1),
 ('show-himmel', 1),
 ('show-host', 1),
 ('show-hypnose', 1),
 ('show-hypnotiseure', 1),
 ('show-hölle', 1),
 ('show-ich', 1),
 ('show-ideen', 1),
 ('show-inspiration', 1),
 ('show-installation', 1),
 ('show-justiz', 1),
 ('show-kacke', 1),
 ('show-kalauer', 1),
 ('show-kanäle', 1),
 ('show-karriere', 1),
 ('show-keeper', 1),
 ('show-kein', 1),
 ('show-kellner', 1),
 ('show-kenner', 1),
 ('show-klassiker', 1),
 ('show-klassikers', 1),
 ('show-kniffs', 1),
 ('show-kollaboratio', 1),
 ('show-kollegin', 1),
 ('show-kommentator', 1),
 ('show-kommerz-quatsch', 1),
 ('show-kompetenz', 1),
 ('show-kontrolle', 1),
 ('show-konzepte', 1),
 ('show-konzert', 1),
 ('show-krieg', 1),
 ('show-kulturkritik', 1),
 ('show-könig', 1),
 ('show-licht', 1),
 ('show-liga', 1),
 ('show-logos', 1),
 ('show-machen', 1),
 ('show-macher', 1),
 ('show-man-qualitäten', 1),
 ('show-marathon', 1),
 ('show-master', 1),
 ('show-matches', 1),
 ('show-men', 1),
 ('show-messer', 1),
 ('show-metzgerei', 1),
 ('show-metzgete', 1),
 ('show-missing', 1),
 ('show-mixen', 1),
 ('show-mobil', 1),
 ('show-mode', 1),
 ('show-moderatoren', 1),
 ('show-mohrenkönig-augsburg', 1),
 ('show-moment', 1),
 ('show-name', 1),
 ('show-namen', 1),
 ('show-nebel', 1),
 ('show-nur', 1),
 ('show-oberpraktikanten', 1),
 ('show-objekten', 1),
 ('show-offen', 1),
 ('show-offensive', 1),
 ('show-offs', 1),
 ('show-opening', 1),
 ('show-outfit', 1),
 ('show-overkill', 1),
 ('show-pad', 1),
 ('show-parkett', 1),
 ('show-pausen', 1),
 ('show-performance', 1),
 ('show-pferd', 1),
 ('show-phrasen', 1),
 ('show-piano', 1),
 ('show-pic', 1),
 ('show-piece', 1),
 ('show-playlist', 1),
 ('show-preis-irgendwas-geschäft', 1),
 ('show-preview', 1),
 ('show-prinzip', 1),
 ('show-proben', 1),
 ('show-produktion', 1),
 ('show-profi', 1),
 ('show-profil', 1),
 ('show-proze', 1),
 ('show-publikum', 1),
 ('show-quickie', 1),
 ('show-quiz', 1),
 ('show-rate', 1),
 ('show-razzia', 1),
 ('show-recherche', 1),
 ('show-redakteur', 1),
 ('show-redakteure', 1),
 ('show-redaktion', 1),
 ('show-reihe', 1),
 ('show-relaunch', 1),
 ('show-report', 1),
 ('show-reporterin', 1),
 ('show-revue', 1),
 ('show-rider', 1),
 ('show-roboter', 1),
 ('show-rooms', 1),
 ('show-routinier', 1),
 ('show-runnerin', 1),
 ('show-röstung', 1),
 ('show-s', 1),
 ('show-saison', 1),
 ('show-salon', 1),
 ('show-satz', 1),
 ('show-scheiss', 1),
 ('show-schlaf', 1),
 ('show-shootings', 1),
 ('show-skandal', 1),
 ('show-slot', 1),
 ('show-solo', 1),
 ('show-sommer', 1),
 ('show-song', 1),
 ('show-songs', 1),
 ('show-special', 1),
 ('show-spektakel', 1),
 ('show-spiel', 1),
 ('show-sponsors', 1),
 ('show-sports', 1),
 ('show-staffel', 1),
 ('show-start', 1),
 ('show-steheln', 1),
 ('show-sternchen', 1),
 ('show-steuerung', 1),
 ('show-stunden', 1),
 ('show-stärke', 1),
 ('show-suppe', 1),
 ('show-sus', 1),
 ('show-talent', 1),
 ('show-talk', 1),
 ('show-taschen', 1),
 ('show-tassen', 1),
 ('show-teil', 1),
 ('show-terminen', 1),
 ('show-tester', 1),
 ('show-texte', 1),
 ('show-ticker', 1),
 ('show-tiere', 1),
 ('show-time-programm', 1),
 ('show-tipp', 1),
 ('show-titanen', 1),
 ('show-tournee', 1),
 ('show-trailer', 1),
 ('show-treppe', 1),
 ('show-trick', 1),
 ('show-tunnel', 1),
 ('show-turnier', 1),
 ('show-turnieren', 1),
 ('show-tv-debatte', 1),
 ('show-typen', 1),
 ('show-unglück', 1),
 ('show-unterricht', 1),
 ('show-urgesteins', 1),
 ('show-veransta', 1),
 ('show-version', 1),
 ('show-versionen', 1),
 ('show-view', 1),
 ('show-view-programmierung', 1),
 ('show-voice', 1),
 ('show-vortrag', 1),
 ('show-wahlkampfabschluss', 1),
 ('show-wahlkreis', 1),
 ('show-wahnsinn', 1),
 ('show-wette', 1),
 ('show-wetter', 1),
 ('show-wiederholungen', 1),
 ('show-wintersportereignis', 1),
 ('show-wochenende', 1),
 ('show-yo', 1),
 ('show-zeitungsständer', 1),
 ('show-zwecke', 1),
 ('show-zweikampf', 1),
 ('show-übernachtungs-gutschein', 1),
 ('show-überraschung', 1),
 ('show-übersicht', 1),
 ('shows-musik-sport', 1),
 ('shows-sensation', 1)]
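As a rough illustration of the entropy formula, the sketch below applies it to the five most frequent left-hand compounds from the output above. It is restricted to this subset, so the value is not the entropy of the full family; the function name `shannon_entropy` is an assumption, and $F(x_i)$ is read here as the frequency of the $i$-th compound and $F_B$ as the summed frequency of the compounds considered.

```python
import math


def shannon_entropy(freqs):
    """Entropy in bits of the distribution given by raw frequencies."""
    total = sum(freqs)
    return -sum((f / total) * math.log2(f / total) for f in freqs if f > 0)


# Five most frequent left-hand compounds for "show" (subset only)
top_left = [
    ("show-down", 27),
    ("show-act", 23),
    ("show-bühne", 22),
    ("show-room", 16),
    ("show-einlage", 12),
]

h = shannon_entropy([f for _, f in top_left])
h_max = math.log2(len(top_left))  # reached only if all frequencies are equal
print(f"H = {h:.3f} bits, maximum = {h_max:.3f} bits")
```

Because these five frequencies are fairly even, the value lies close to the maximum for five types. A family dominated by a single very frequent compound would score much lower.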
In [30]:
# Internal-constituent hyphenated compounds in the Twitter corpus for the type "show", sorted by descending frequency
ifreqs = sorted([x for y in tw_compounds[tw_compounds["word"] == "show"]["intern"].values for x in y], key=lambda x: x[1], reverse=True)
ifreqs
[('after-show-party', 122),
 ('no-show-rate', 14),
 ('satire-show-talk', 4),
 ('best-of-show-award', 3),
 ('no-show-quote', 3),
 ('pre-show-party', 3),
 ('after-show-dinner', 2),
 ('after-show-movie', 2),
 ('after-show-parties', 2),
 ('after-show-partys', 2),
 ('csd-after-show-party', 2),
 ('heute-show-battle', 2),
 ('heute-show-vom', 2),
 ('no-show-gebühr', 2),
 ('off-show-termin', 2),
 ('reality-show-star', 2),
 ('rock-show-produktion', 2),
 ('silvester-show-gala', 2),
 ('slide-show-standup', 2),
 ('talk-show-auftritte', 2),
 ('youtube-show-tag', 2),
 ('af-ter-show-par-tyyyyy', 1),
 ('afder-show-party', 1),
 ('after-show-bier', 1),
 ('after-show-bierchen', 1),
 ('after-show-deutschrock-party', 1),
 ('after-show-drink', 1),
 ('after-show-drinks', 1),
 ('after-show-event', 1),
 ('after-show-frühstück', 1),
 ('after-show-gespräche', 1),
 ('after-show-interview', 1),
 ('after-show-kuchen', 1),
 ('after-show-lounge', 1),
 ('after-show-parteitag', 1),
 ('after-show-sendungskritik', 1),
 ('after-show-session', 1),
 ('after-show-talk', 1),
 ('after-show-video', 1),
 ('armory-show-messebericht', 1),
 ('award-show-kino', 1),
 ('big-show-comeback', 1),
 ('big-show-spaß', 1),
 ('bismarck-after-show-party', 1),
 ('bundesliga-show-m', 1),
 ('bus-live-show-nacht', 1),
 ('casting-show-finales', 1),
 ('casting-show-idee', 1),
 ('casting-show-kandidaten', 1),
 ('casting-show-settings', 1),
 ('casting-show-sieger', 1),
 ('casting-show-verhältnisse', 1),
 ('casting-show-zweite', 1),
 ('daily-show-korrespondentensegment', 1),
 ('dating-show-trauma', 1),
 ('doppel-transen-show-dirigieren', 1),
 ('downloade-alle-folgen-regular-show-von-kisscartoons', 1),
 ('ego-show-num', 1),
 ('fashion-show-hauptstadt', 1),
 ('fernseh-show-moderation', 1),
 ('fischer-after-show-party', 1),
 ('football-show-dings', 1),
 ('game-show-moderator-darsteller', 1),
 ('ge-slam-show-t', 1),
 ('harald-schmidt-show-ausgaben', 1),
 ('haupt-show-act', 1),
 ('heute-show-ausschnitt', 1),
 ('heute-show-beitrag', 1),
 ('heute-show-fanshop', 1),
 ('heute-show-haut', 1),
 ('heute-show-moderator', 1),
 ('heute-show-mäßiges', 1),
 ('heute-show-namensschild', 1),
 ('heute-show-plakat', 1),
 ('heute-show-praktikant', 1),
 ('heute-show-praktikantin', 1),
 ('heute-show-reporter', 1),
 ('heute-show-star', 1),
 ('heute-show-studio', 1),
 ('heute-show-teams', 1),
 ('heute-show-voice', 1),
 ('heute-show-witzen', 1),
 ('karaoke-show-moderation', 1),
 ('kasper-show-trailer', 1),
 ('klitschko-show-spektakel', 1),
 ('koch-show-wahnsinn', 1),
 ('kuppel-show-kandidatin', 1),
 ('kuppel-show-verbot', 1),
 ('late-night-show-ding', 1),
 ('late-night-show-redaktionen', 1),
 ('late-show-regel', 1),
 ('late-show-versuche', 1),
 ('latenight-show-job', 1),
 ('liga-show-veranstaltung', 1),
 ('live-show-entertainment', 1),
 ('live-show-schriftsteller', 1),
 ('live-show-schuh', 1),
 ('mach-show-konzept', 1),
 ('mini-playback-show-gewinner', 1),
 ('mini-playback-show-karte', 1),
 ('mini-playback-show-moderatorin', 1),
 ('mini-playback-show-teilnehmer', 1),
 ('mini-playback-show-tür', 1),
 ('morning-show-leute', 1),
 ('no-show-gebühren', 1),
 ('no-show-liste', 1),
 ('no-show-passagier', 1),
 ('no-show-problematik', 1),
 ('no-show-rates', 1),
 ('off-show-tag', 1),
 ('one-man-show-argument', 1),
 ('one-man-show-rpg', 1),
 ('one-man-show-videos', 1),
 ('one-show-interactive-jury', 1),
 ('peep-show-stange', 1),
 ('percussion-show-klassiker', 1),
 ('pod-casting-show-produzent', 1),
 ('polit-show-bühne-berlin', 1),
 ('polit-show-konzept', 1),
 ('polit-show-typen', 1),
 ('pop-show-sexismus', 1),
 ('post-show-bier-degustier-stream', 1),
 ('pre-show-backstage-office', 1),
 ('pre-show-partys', 1),
 ('pre-show-stream', 1),
 ('pre-show-videos', 1),
 ('quiz-show-geständnis', 1),
 ('radio-show-konzept', 1),
 ('reality-show-hintergrundmusik', 1),
 ('reality-show-stammseher', 1),
 ('reality-show-teilnehmer', 1),
 ('reality-show-teleshopping-nachgeschmack', 1),
 ('rock-show-production', 1),
 ('rocky-horror-picture-show-outfits', 1),
 ('rocky-horror-picture-show-song', 1),
 ('samstag-abend-show-moderatoren-urgesteine', 1),
 ('sommer-show-wochen', 1),
 ('sommer-sonnen-samstags-show-spaß', 1),
 ('special-show-act', 1),
 ('start-up-show-tweets', 1),
 ('super-show-format', 1),
 ('talk-show-auftritt', 1),
 ('talk-show-gast', 1),
 ('talk-show-junkie', 1),
 ('talk-show-marathon', 1),
 ('talk-show-pause', 1),
 ('talk-show-rubrik', 1),
 ('talk-show-runde', 1),
 ('talk-show-szene', 1),
 ('talk-show-teilnahme', 1),
 ('talk-show-tingler', 1),
 ('tellonym-show-offs', 1),
 ('tonight-show-rubrik', 1),
 ('trash-show-revival', 1),
 ('trueman-show-like', 1),
 ('truman-show-momente', 1),
 ('tv-show-entscheidungen', 1),
 ('tv-show-episoden', 1),
 ('tv-show-kandidatin', 1),
 ('tv-show-pitch', 1),
 ('tv-show-produkte', 1),
 ('tv-show-setup', 1),
 ('tv-show-teilnehmer', 1),
 ('tv-show-tracker', 1),
 ('tv-show-urgestein', 1),
 ('uberspace-show-quota', 1),
 ('uhr-promi-game-show-liga', 1),
 ('videodays-after-show-party', 1),
 ('weihnachts-show-tour', 1),
 ('wong-show-special', 1),
 ('wunsch-show-stunden', 1),
 ('youtube-late-show-routine', 1),
 ('youtube-show-interview', 1),
 ('zirkus-show-pferd', 1)]
In [31]:
# Right-hand-constituent hyphenated compounds in the Twitter corpus for the type "show", sorted by descending frequency
rfreqs = sorted([x for y in tw_compounds[tw_compounds["word"] == "show"]["right"].values for x in y], key=lambda x: x[1], reverse=True)
rfreqs
[('heute-show', 730),
 ('tv-show', 338),
 ('youtube-show', 231),
 ('live-show', 215),
 ('pre-show', 113),
 ('one-man-show', 112),
 ('tv-shows', 97),
 ('late-night-show', 64),
 ('casting-show', 61),
 ('live-shows', 60),
 ('comedy-show', 58),
 ('rtl-show', 55),
 ('talk-show', 44),
 ('after-show', 43),
 ('halbzeit-show', 41),
 ('dia-show', 40),
 ('halftime-show', 40),
 ('zdf-show', 40),
 ('reality-show', 39),
 ('muppet-show', 37),
 ('casting-shows', 35),
 ('pr-show', 34),
 ('talk-shows', 32),
 ('fashion-show', 31),
 ('no-shows', 31),
 ('morning-show', 30),
 ('koch-show', 27),
 ('helene-fischer-show', 26),
 ('quiz-show', 26),
 ('radio-show', 26),
 ('dinner-show', 24),
 ('no-show', 24),
 ('satire-show', 23),
 ('samstagabend-show', 21),
 ('maschmeyer-show', 20),
 ('mini-playback-show', 20),
 ('moin-show', 20),
 ('pyro-show', 19),
 ('berlin-show', 18),
 ('freak-show', 18),
 ('multimedia-show', 18),
 ('stuggi-show', 18),
 ('dj-show', 17),
 ('horror-show', 17),
 ('multivisions-show', 17),
 ('truman-show', 17),
 ('ard-show', 16),
 ('die-aktuelle-heute-show', 16),
 ('fischer-show', 16),
 ('heise-show', 16),
 ('mega-show', 16),
 ('brieflos-show', 15),
 ('call-in-show', 15),
 ('impro-show', 15),
 ('olympia-show', 15),
 ('solo-show', 15),
 ('award-show', 14),
 ('dating-show', 14),
 ('mario-barth-show', 14),
 ('one-woman-show', 14),
 ('varieté-show', 14),
 ('auto-show', 13),
 ('award-shows', 13),
 ('chart-show', 13),
 ('ego-show', 13),
 ('koch-shows', 13),
 ('news-show', 13),
 ('primetime-show', 13),
 ('harald-schmidt-show', 12),
 ('kult-show', 12),
 ('abschluss-show', 11),
 ('deutschland-show', 11),
 ('entscheidungs-show', 11),
 ('muppets-show', 11),
 ('pre-game-show', 11),
 ('prosieben-show', 11),
 ('reality-shows', 11),
 ('trump-show', 11),
 ('burlesque-show', 10),
 ('fernseh-show', 10),
 ('gründer-show', 10),
 ('jubiläums-show', 10),
 ('messi-show', 10),
 ('reality-tv-show', 10),
 ('release-show', 10),
 ('tdc-show', 10),
 ('us-show', 10),
 ('apple-show', 9),
 ('ein-mann-show', 9),
 ('freestyle-show', 9),
 ('game-show', 9),
 ('kommentare-kommentier-show', 9),
 ('podcast-show', 9),
 ('polit-show', 9),
 ('propaganda-show', 9),
 ('retro-show', 9),
 ('rtl-shows', 9),
 ('samwer-show', 9),
 ('start-up-show', 9),
 ('vox-show', 9),
 ('weihnachts-show', 9),
 ('dating-shows', 8),
 ('karaoke-show', 8),
 ('laser-show', 8),
 ('mode-show', 8),
 ('monster-truck-show', 8),
 ('raab-show', 8),
 ('ranking-shows', 8),
 ('seelöwen-show', 8),
 ('silvester-show', 8),
 ('kickoff-show', 7),
 ('knoff-hoff-show', 7),
 ('light-show', 7),
 ('model-show', 7),
 ('pazvanti-show', 7),
 ('satire-shows', 7),
 ('startup-show', 7),
 ('technik-show', 7),
 ('wahlkampf-show', 7),
 ('bundesliga-show', 6),
 ('carpet-show', 6),
 ('club-shows', 6),
 ('comedy-shows', 6),
 ('countdown-show', 6),
 ('esc-show', 6),
 ('feuer-show', 6),
 ('gratis-show', 6),
 ('half-time-show', 6),
 ('härtsplatt-show', 6),
 ('ice-show', 6),
 ('lese-show', 6),
 ('licht-show', 6),
 ('matchday-show', 6),
 ('oscar-show', 6),
 ('red-carpet-show', 6),
 ('schmidt-show', 6),
 ('silbereisen-show', 6),
 ('super-show', 6),
 ('tanz-show', 6),
 ('vorher-nachher-show', 6),
 ('wahl-show', 6),
 ('wuki-show', 6),
 ('zuckerberg-show', 6),
 ('abba-show', 5),
 ('behindert-show', 5),
 ('bolt-show', 5),
 ('em-show', 5),
 ('fake-show', 5),
 ('hammer-show', 5),
 ('highlight-show', 5),
 ('hypnose-show', 5),
 ('ich-show', 5),
 ('jass-show', 5),
 ('kb-show', 5),
 ('latenight-show', 5),
 ('lindner-show', 5),
 ('mark-show', 5),
 ('morgen-show', 5),
 ('motivations-show', 5),
 ('night-show', 5),
 ('opening-show', 5),
 ('orf-show', 5),
 ('peep-show', 5),
 ('personality-show', 5),
 ('post-show', 5),
 ('prime-time-show', 5),
 ('promi-show', 5),
 ('quest-show', 5),
 ('slam-show', 5),
 ('top-show', 5),
 ('trampolin-show', 5),
 ('trash-show', 5),
 ('wahnsinns-show', 5),
 ('warm-up-show', 5),
 ('wdr-show', 5),
 ('wiwaldi-show', 5),
 ('wrestling-show', 5),
 ('abig-show', 4),
 ('akustik-show', 4),
 ('album-shows', 4),
 ('beatles-show', 4),
 ('bikini-show', 4),
 ('bookclub-show', 4),
 ('böhmermann-show', 4),
 ('comeback-show', 4),
 ('deutschland-shows', 4),
 ('dia-shows', 4),
 ('drei-länder-show', 4),
 ('elvis-show', 4),
 ('flug-show', 4),
 ('game-shows', 4),
 ('gaming-show', 4),
 ('geburtstags-show', 4),
 ('geo-show', 4),
 ('gop-show', 4),
 ('hamburg-show', 4),
 ('hip-hop-show', 4),
 ('hz-show', 4),
 ('kappa-show', 4),
 ('las-vegas-show', 4),
 ('late-night-shows', 4),
 ('mawev-show', 4),
 ('musik-show', 4),
 ('neuheiten-show', 4),
 ('ng-show', 4),
 ('obama-show', 4),
 ('oldtimer-show', 4),
 ('online-show', 4),
 ('open-air-shows', 4),
 ('pferde-show', 4),
 ('piep-show', 4),
 ('playback-show', 4),
 ('pocher-show', 4),
 ('politik-show', 4),
 ('ps-gala-show', 4),
 ('punkrock-show', 4),
 ('puppen-impro-show', 4),
 ('putin-show', 4),
 ('ranking-show', 4),
 ('revue-show', 4),
 ('samstagabend-shows', 4),
 ('sketch-show', 4),
 ('solarcar-show', 4),
 ('stand-up-show', 4),
 ('sternennacht-show', 4),
 ('stunt-show', 4),
 ('travestie-show', 4),
 ('tribute-show', 4),
 ('tv-karaoke-show', 4),
 ('twitter-show', 4),
 ('web-show', 4),
 ('action-show', 3),
 ('adriana-del-rossi-show', 3),
 ('air-show', 3),
 ('album-release-show', 3),
 ('an-show', 3),
 ('aussichten-show', 3),
 ('back-shows', 3),
 ('begeisterungs-show', 3),
 ('betten-show', 3),
 ('bisschenhorror-show', 3),
 ('bühnen-show', 3),
 ('call-in-shows', 3),
 ('catwalk-show', 3),
 ('ceylan-show', 3),
 ('chemie-show', 3),
 ('club-show', 3),
 ('doppelgänger-show', 3),
 ('echo-show', 3),
 ('esc-shows', 3),
 ('eurovision-show', 3),
 ('facebook-show', 3),
 ('feuerwerk-show', 3),
 ('fil-show', 3),
 ('final-show', 3),
 ('flirt-show', 3),
 ('foto-show', 3),
 ('free-show', 3),
 ('fußball-show', 3),
 ('gala-show', 3),
 ('gong-show', 3),
 ('gottschalk-show', 3),
 ('headliner-show', 3),
 ('headliner-shows', 3),
 ('hiphop-show', 3),
 ('honey-show', 3),
 ('ii-show', 3),
 ('iphone-show', 3),
 ('jamaika-show', 3),
 ('karlich-show', 3),
 ('klimarettungs-show', 3),
 ('krokodil-show', 3),
 ('kruse-show', 3),
 ('kuppel-show', 3),
 ('kuppel-shows', 3),
 ('köln-show', 3),
 ('lammfell-show', 3),
 ('lanz-show', 3),
 ('led-show', 3),
 ('leistungs-show', 3),
 ('lieblings-show', 3),
 ('life-show', 3),
 ('mitternachts-show', 3),
 ('mo-show', 3),
 ('motorrad-shows', 3),
 ('musical-show', 3),
 ('musik-shows', 3),
 ('nordkorea-show', 3),
 ('off-shows', 3),
 ('one-artist-show', 3),
 ('one-man-shows', 3),
 ('orca-show', 3),
 ('pannen-show', 3),
 ('pep-show', 3),
 ('ps-show', 3),
 ('rammstein-tribute-show', 3),
 ('rhythmik-show', 3),
 ('road-show', 3),
 ('robben-show', 3),
 ('rockstar-show', 3),
 ('ronaldo-show', 3),
 ('ros-show', 3),
 ('ryf-show', 3),
 ('samstag-abend-show', 3),
 ('schmidt-karaoke-show', 3),
 ('scripted-reality-shows', 3),
 ('selfmade-show', 3),
 ('single-show', 3),
 ('skl-show', 3),
 ('slide-show', 3),
 ('solo-shows', 3),
 ('sport-comedy-show', 3),
 ('strip-show', 3),
 ('strip-shows', 3),
 ('teleshopping-show', 3),
 ('tor-show', 3),
 ('trash-shows', 3),
 ('truck-show', 3),
 ('unplugged-show', 3),
 ('update-show', 3),
 ('us-late-night-show', 3),
 ('us-shows', 3),
 ('vdp-leistungs-show', 3),
 ('vegas-show', 3),
 ('vpt-show', 3),
 ('vr-show', 3),
 ('witze-show', 3),
 ('xxl-show', 3),
 ('xxx-show', 3),
 ('zauber-show', 3),
 ('überrauschungs-show', 3),
 ('acoustic-show', 2),
 ('acoustic-shows', 2),
 ('adam-riese-show', 2),
 ('air-shows', 2),
 ('akrobatik-show', 2),
 ('amazon-show', 2),
 ('aqua-show', 2),
 ('ard-propaganda-show', 2),
 ('arena-show', 2),
 ('audio-slide-shows', 2),
 ('auslosungs-show', 2),
 ('auto-shows', 2),
 ('back-show', 2),
 ('baller-show', 2),
 ('balmain-show', 2),
 ('battle-show', 2),
 ('bayern-show', 2),
 ('beauty-show', 2),
 ('bellrays-show', 2),
 ('benefiz-show', 2),
 ('berlin-shows', 2),
 ('best-of-show', 2),
 ('bilder-show', 2),
 ('bingo-show', 2),
 ('bluebox-show', 2),
 ('bollywood-show', 2),
 ('breakdance-show', 2),
 ('bundeswehr-reality-show', 2),
 ('bunny-show', 2),
 ('chanel-show', 2),
 ('chanell-devero-show', 2),
 ('chart-shows', 2),
 ('circus-show', 2),
 ('clown-show', 2),
 ('clueso-show', 2),
 ('coaching-doku-show', 2),
 ('container-show', 2),
 ('custombike-show', 2),
 ('dance-show', 2),
 ('delphin-show', 2),
 ('design-show', 2),
 ('dessous-show', 2),
 ('deutschrap-show', 2),
 ('dia-multi-visions-show', 2),
 ('die-show', 2),
 ('dino-show', 2),
 ('dj-shows', 2),
 ('dreyer-show', 2),
 ('drift-show', 2),
 ('drohnen-show', 2),
 ('dunking-show', 2),
 ('duo-show', 2),
 ('ein-frau-show', 2),
 ('einmann-show', 2),
 ('eis-show', 2),
 ('ejakulator-show', 2),
 ('elmi-radio-show', 2),
 ('erdogan-show', 2),
 ('europa-shows', 2),
 ('family-show', 2),
 ('fashion-shows', 2),
 ('fernseh-shows', 2),
 ('festival-show', 2),
 ('festival-shows', 2),
 ('flamenco-show', 2),
 ('fnord-news-show', 2),
 ('fnord-show', 2),
 ('folklore-show', 2),
 ('football-show', 2),
 ('gesundheits-show', 2),
 ('gif-show', 2),
 ('glitzer-show', 2),
 ('greifvogel-show', 2),
 ('grill-show', 2),
 ('groko-show', 2),
 ('hair-show', 2),
 ('hans-jessen-show', 2),
 ('hardcore-show', 2),
 ('hardcore-shows', 2),
 ('headline-show', 2),
 ('heidi-klum-show', 2),
 ('helene-fischer-shows', 2),
 ('helene-show', 2),
 ('herzblatt-show', 2),
 ('holografie-show', 2),
 ('home-makeover-show', 2),
 ('hr-show', 2),
 ('impro-langformat-show', 2),
 ('impro-shows', 2),
 ('indie-show', 2),
 ('instagram-show', 2),
 ('internet-show', 2),
 ('interview-show', 2),
 ('it-show', 2),
 ('killerwal-show', 2),
 ('klopp-show', 2),
 ('klubbing-show', 2),
 ('klum-show', 2),
 ('koch-late-night-show', 2),
 ('krimi-show', 2),
 ('landgren-shows', 2),
 ('late-show', 2),
 ('latenight-shows', 2),
 ('lecture-show', 2),
 ('leo-show', 2),
 ('letterman-show', 2),
 ('literatur-show', 2),
 ('live-talk-show', 2),
 ('luther-show', 2),
 ('lämmer-show', 2),
 ('lügendetektor-show', 2),
 ('macron-show', 2),
 ('magie-show', 2),
 ('mallorca-show', 2),
 ('man-show', 2),
 ('mann-show', 2),
 ('marketing-show', 2),
 ('maus-show', 2),
 ('mcfly-show', 2),
 ('mdr-osterfeuer-show', 2),
 ('mdr-show', 2),
 ('media-casting-show', 2),
 ('microsoft-show', 2),
 ('miniplayback-show', 2),
 ('minuten-show', 2),
 ('mockridge-show', 2),
 ('motto-shows', 2),
 ('multimedia-shows', 2),
 ('music-show', 2),
 ('musik-comedy-show', 2),
 ('mystery-show', 2),
 ('müller-show', 2),
 ('münchen-show', 2),
 ('nacht-show', 2),
 ('nackt-selfi-show', 2),
 ('nackt-show', 2),
 ('nazi-show', 2),
 ('nerd-show', 2),
 ('netflix-show', 2),
 ('netflix-shows', 2),
 ('nikolaus-show', 2),
 ('november-show', 2),
 ('off-show', 2),
 ('one-ei-show', 2),
 ('one-women-show', 2),
 ('open-air-release-show', 2),
 ('open-air-show', 2),
 ('oppong-show', 2),
 ('orchester-show', 2),
 ('panel-show', 2),
 ('panzer-show', 2),
 ('park-show', 2),
 ('peer-show', 2),
 ('pimmel-show', 2),
 ('playmobil-show', 2),
 ('poledance-show', 2),
 ('polizei-show', 2),
 ('pop-show', 2),
 ('populismus-show', 2),
 ('power-show', 2),
 ('pre-game-shows', 2),
 ('pre-shows', 2),
 ('pregame-show', 2),
 ('proberaum-show', 2),
 ('promo-show', 2),
 ('ptv-show', 2),
 ('raab-shows', 2),
 ('rate-show', 2),
 ('reality-tv-shows', 2),
 ('reunion-show', 2),
 ('revival-show', 2),
 ('robot-show', 2),
 ('rocko-schamoni-show', 2),
 ('rocky-horror-show', 2),
 ('rollstuhltanzsport-show', 2),
 ('rossi-show', 2),
 ('runway-show', 2),
 ('samstag-abend-shows', 2),
 ('sauf-show', 2),
 ('scheinwerfer-show', 2),
 ('schungs-show', 2),
 ('science-show', 2),
 ('seehofer-show', 2),
 ('sex-show', 2),
 ('shit-show', 2),
 ('signatur-show', 2),
 ('ski-show', 2),
 ('slide-shows', 2),
 ('soli-show', 2),
 ('solidaritäts-show', 2),
 ('sommer-show', 2),
 ('spiel-show', 2),
 ('sport-shows', 2),
 ('standup-show', 2),
 ('star-show', 2),
 ('striptease-shows', 2),
 ('support-show', 2),
 ('support-shows', 2),
 ('tabaluga-show', 2),
 ('talent-show', 2),
 ('tech-show', 2),
 ('tegel-show', 2),
 ('theater-show', 2),
 ('theater-talk-show', 2),
 ('themen-show', 2),
 ('tischtennis-show', 2),
 ('tivoli-show', 2),
 ('trailer-show', 2),
 ('trödel-show', 2),
 ('tuning-show', 2),
 ('tv-casting-shows', 2),
 ('tv-flirt-show', 2),
 ('uk-shows', 2),
 ('urwahl-casting-show', 2),
 ('varité-show', 2),
 ('verarsche-show', 2),
 ('verkaufs-show', 2),
 ('video-show', 2),
 ('video-shows', 2),
 ('videospiel-show', 2),
 ('vier-tore-show', 2),
 ('warm-up-shows', 2),
 ('warmup-show', 2),
 ('warmup-shows', 2),
 ('wein-show', 2),
 ('wien-show', 2),
 ('wintergarten-show', 2),
 ('wissens-show', 2),
 ('wissenschafts-show', 2),
 ('wm-show', 2),
 ('wochenend-show', 2),
 ('wohlfühl-show', 2),
 ('wüsten-show', 2),
 ('xmas-show', 2),
 ('yt-show', 2),
 ('überraschungs-show', 2),
 ('abba-tribute-show', 1),
 ('abdollahi-late-night-show', 1),
 ('abschiebe-show', 1),
 ('abschieds-show', 1),
 ('abschluss-shows', 1),
 ('abstimmungs-show', 1),
 ('ac-show', 1),
 ('aci-icehouse-show', 1),
 ('ackern-show', 1),
 ('action-reality-show', 1),
 ('adele-show', 1),
 ('adoptions-show', 1),
 ('advent-show', 1),
 ('advents-show', 1),
 ('afd-propaganda-show', 1),
 ('affen-show', 1),
 ('afrika-show', 1),
 ('after-shows', 1),
 ('akani-show', 1),
 ('akkustik-show', 1),
 ('akrobatik-zirkus-oldtime-jazz-show', 1),
 ('album-show', 1),
 ('alk-talk-show', 1),
 ('all-male-show', 1),
 ('alpaka-show', 1),
 ('amazon-auto-show', 1),
 ('amy-show', 1),
 ('anal-show', 1),
 ('android-show', 1),
 ('animations-tv-show', 1),
 ('anna-netrebko-show', 1),
 ('anschluss-shows', 1),
 ('anti-europa-cameron-show', 1),
 ('anti-trump-show', 1),
 ('antwerp-fashion-show', 1),
 ('apassionata-show', 1),
 ('apfel-show', 1),
 ('apologeten-reinwaschungs-anbiederungs-show', 1),
 ('apple-late-night-show', 1),
 ('araber-show', 1),
 ('ard-talk-shows', 1),
 ('ard-tv-show', 1),
 ('ard-vorabend-show', 1),
 ('arena-shows', 1),
 ('arenen-shows', 1),
 ('armory-show', 1),
 ('artus-show', 1),
 ('aschermittwochs-show', 1),
 ('askgaryvee-show', 1),
 ('assi-tv-shows', 1),
 ('atv-show', 1),
 ('audio-show', 1),
 ('audio-shows', 1),
 ('audiovision-show', 1),
 ('austriasnexttopleader-show', 1),
 ('auswanderer-trash-shows', 1),
 ('auto-ad-shows', 1),
 ('auto-angeber-prollo-show', 1),
 ('auto-horror-show', 1),
 ('awards-show', 1),
 ('awesome-coldmirror-show', 1),
 ('ayurveda-koch-show', 1),
 ('azad-show', 1),
 ('badewannen-show', 1),
 ('baku-show', 1),
 ('balder-show', 1),
 ('balder-shows', 1),
 ('balkon-show', 1),
 ('ballet-show', 1),
 ('ballett-show', 1),
 ('band-show', 1),
 ('bang-die-josy-show', 1),
 ('bar-show', 1),
 ('barbara-karlich-show', 1),
 ('barca-show', 1),
 ('barney-show', 1),
 ('barry-show', 1),
 ('barth-show', 1),
 ('bashing-show', 1),
 ('basler-show', 1),
 ('basquiat-show', 1),
 ('bastian-show', 1),
 ('batman-shows', 1),
 ('batschkapp-show', 1),
 ('batshuayi-show', 1),
 ('battle-shows', 1),
 ('bbc-show', 1),
 ('beatbox-show', 1),
 ('beauty-shows', 1),
 ('beckmann-show', 1),
 ('beginner-show', 1),
 ('benefiz-dinner-show', 1),
 ('bengalo-show', 1),
 ('benny-hill-show', 1),
 ('berlinale-show', 1),
 ('berry-show', 1),
 ('berufs-show', 1),
 ('best-practice-show', 1),
 ('besten-esc-momente-show', 1),
 ('betrunkenen-show', 1),
 ('bett-show', 1),
 ('bewerbungs-show', 1),
 ('beyaz-show', 1),
 ('bibi-und-tina-show', 1),
 ('bierzelt-show', 1),
 ('bierzelt-shows', 1),
 ('big-shows', 1),
 ('bigalke-show', 1),
 ('biggar-show', 1),
 ('bike-shows', 1),
 ('biker-show', 1),
 ('bild-dschungel-show', 1),
 ('billen-show', 1),
 ('billig-wiederholungs-show', 1),
 ('bingo-live-show', 1),
 ('björk-show', 1),
 ('blab-show', 1),
 ('blackface-shows', 1),
 ('blackhat-show', 1),
 ('blitz-show', 1),
 ('blockbuster-shows', 1),
 ('blogger-runway-show', 1),
 ('blödel-show', 1),
 ('bmt-radio-show', 1),
 ('bmx-show', 1),
 ('bombi-show', 1),
 ('bonaparte-show', 1),
 ('bonus-show', 1),
 ('book-a-lindner-show', 1),
 ('book-show', 1),
 ('boring-show', 1),
 ('boris-show', 1),
 ('bosshoss-show', 1),
 ('boy-burlesque-show', 1),
 ('boyleske-show', 1),
 ('br-show', 1),
 ('brainpool-show', 1),
 ('brandschutz-show', 1),
 ('brautmoden-show', 1),
 ('breakfast-show', 1),
 ('breakmx-show', 1),
 ('brick-show', 1),
 ('britten-show', 1),
 ('broadcasting-gesprächs-shows', 1),
 ('broadway-show', 1),
 ('broadway-shows', 1),
 ('brock-osweiler-show', 1),
 ('brontosaurus-show', 1),
 ('bröhmann-show', 1),
 ('brüssel-show', 1),
 ('bubbel-haifisch-show', 1),
 ('buddy-holly-show', 1),
 ('buehnen-show', 1),
 ('bunga-bunga-show', 1),
 ('buzzfeed-show', 1),
 ('buzzindex-show', 1),
 ('bäppi-show', 1),
 ('bücherclub-show', 1),
 ('bühne-show', 1),
 ('bühnen-live-show', 1),
 ('büro-quiz-show', 1),
 ('ca-sting-show', 1),
 ('cabuwazi-show', 1),
 ('candy-crush-tv-show', 1),
 ('cardioentertainment-show', 1),
 ('carmen-nebel-show', 1),
 ('carmen-nebel-shows', 1),
 ('casper-shows', 1),
 ('castings-show', 1),
 ('catsing-show', 1),
 ('catwalk-shows', 1),
 ('caveman-show', 1),
 ('cavewoman-show', 1),
 ('cd-promo-show', 1),
 ('cdu-show', 1),
 ('cebit-show', 1),
 ('celebrity-shows', 1),
 ('ceta-show', 1),
 ('challenge-show', 1),
 ('charlie-grünhorn-show', 1),
 ('charlie-sheen-show', 1),
 ('check-in-show', 1),
 ('chelsea-flower-show', 1),
 ('chicharito-kießling-show', 1),
 ('china-show', 1),
 ('chippendales-show', 1),
 ('chris-show', 1),
 ('christen-show', 1),
 ('cirkus-show', 1),
 ('clip-shows', 1),
 ('clown-shows', 1),
 ('clownerie-show', 1),
 ('clowns-show', 1),
 ('club-bingo-shows', 1),
 ('cnn-prime-time-show', 1),
 ('collection-show', 1),
 ('comedy-game-show', 1),
 ('comedy-hypnose-show', 1),
 ('comedy-impro-talk-show', 1),
 ('comedy-sketch-shows', 1),
 ('comedy-wissenshow-show', 1),
 ('comod-show', 1),
 ('container-shows', 1),
 ('cop-show', 1),
 ('cosby-show', 1),
 ('couchsurfing-show', 1),
 ('coulthard-show', 1),
 ('countdown-shows', 1),
 ('couture-show', 1),
 ('cowboy-show', 1),
 ('crew-show', 1),
 ('dabbur-show', 1),
 ('dahlien-show', 1),
 ('daily-reality-show', 1),
 ('daily-show', 1),
 ('daniel-aminati-fitness-show', 1),
 ('date-show', 1),
 ('daten-show', 1),
 ('dave-grohl-solo-show', 1),
 ('ddr-retro-shows', 1),
 ('delfin-shows', 1),
 ('dema-show', 1),
 ('demand-shows', 1),
 ('demo-show', 1),
 ('demokratie-show', 1),
 ('derwisch-show', 1),
 ('dessous-shows', 1),
 ('dezember-show', 1),
 ('dfb-pokal-halbzeit-show', 1),
 ('dfb-show', 1),
 ('dfl-show', 1),
 ('dia-ton-show', 1),
 ('didi-show', 1),
 ('dienstags-show', 1),
 ('dieter-bohlen-show', 1),
 ('digitalisierungs-show', 1),
 ('dildo-show', 1),
 ('dingdong-show', 1),
 ('dingens-show', 1),
 ('dinner-krimi-show', 1),
 ('dinner-open-air-show', 1),
 ('dinner-shows', 1),
 ('dinosaurier-show', 1),
 ('dirndl-show', 1),
 ('discjockey-show', 1),
 ('divertimento-show', 1),
 ('diy-show', 1),
 ('diy-shows', 1),
 ('doggy-show', 1),
 ('doku-show', 1),
 ('donald-show', 1),
 ('donald-trump-gesundheits-show', 1),
 ('donnerstags-show', 1),
 ('doom-show', 1),
 ('doppel-show', 1),
 ('dortmund-show', 1),
 ('downtown-show', 1),
 ('drag-show', 1),
 ('dreamliner-show', 1),
 ('dschungel-show', 1),
 ('dsds-motto-show', 1),
 ('duell-show', 1),
 ('dunham-hipster-feminismus-show', 1),
 ('duschgel-show', 1),
 ('dümmel-show', 1),
 ('edeka-show', 1),
 ('edmund-stoiber-show', 1),
 ('ego-shows', 1),
 ('egotronic-show', 1),
 ('einertalk-show', 1),
 ('einlauf-show', 1),
 ('einpark-show', 1),
 ('ekel-show', 1),
 ('election-show', 1),
 ('elektro-live-show', 1),
 ('elektro-radio-show', 1),
 ('elsner-show', 1),
 ('eluveitie-show', 1),
 ('elvis-revival-show', 1),
 ('emo-show', 1),
 ('energie-show', 1),
 ('energiewende-show', 1),
 ('energy-show', 1),
 ('engel-show', 1),
 ('england-shows', 1),
 ('ensemble-show', 1),
 ('entertainment-shows', 1),
 ('entscheidungs-shows', 1),
 ('ep-shows', 1),
 ('epochen-show', 1),
 ('ermittler-shows', 1),
 ('ernstings-family-fashion-catwalk-show', 1),
 ('erotik-show', 1),
 ('eröffnungs-shows', 1),
 ('es-show', 1),
 ('esc-countdown-show', 1),
 ('essen-motor-show', 1),
 ('ester-show', 1),
 ('eu-show', 1),
 ('euro-show', 1),
 ('europa-mitkoch-show', 1),
 ('europa-show', 1),
 ('eurovisions-show', 1),
 ('eva-green-show', 1),
 ('event-shows', 1),
 ('ex-show', 1),
 ('ex-trump-show', 1),
 ('exit-show', 1),
 ('experiment-show', 1),
 ('experiment-shows', 1),
 ('experimente-show', 1),
 ('expo-show', 1),
 ('express-tv-show', 1),
 ('fabulara-show', 1),
 ('facebook-video-shows', 1),
 ('faisal-kawusi-show', 1),
 ('falkner-show', 1),
 ('falknerei-show', 1),
 ('familien-abend-show', 1),
 ('familien-shows', 1),
 ('familienbande-show', 1),
 ('fantastico-show', 1),
 ('fantasy-tv-show', 1),
 ('farewell-shows', 1),
 ('faschings-show', 1),
 ('fascho-show', 1),
 ('fasion-show', 1),
 ('fast-oben-ohne-show', 1),
 ('fastbreak-show', 1),
 ('fastenbrechen-shows', 1),
 ('fb-live-show', 1),
 ('fdp-show', 1),
 ('feature-show', 1),
 ('feste-shows', 1),
 ('festgepoppt-shows', 1),
 ('fetisch-fashion-show', 1),
 ('fetish-show', 1),
 ('feuer-wasser-show', 1),
 ('fifa-talk-show', 1),
 ('final-shows', 1),
 ('finanz-show', 1),
 ('fischerleni-show', 1),
 ('fitzek-show', 1),
 ('flac-show', 1),
 ('flamenco-comedy-show', 1),
 ('flickr-show', 1),
 ('flop-show', 1),
 ('floyd-show', 1),
 ('folter-show', 1),
 ('fondue-dinner-show', 1),
 ('fontain-show', 1),
 ('foren-show', 1),
 ('forver-alone-show', 1),
 ('four-show', 1),
 ('fpö-show', 1),
 ('fpö-shows', 1),
 ('frag-mutti-show', 1),
 ('frage-antwort-show', 1),
 ('frank-elstner-show', 1),
 ('frank-schöbel-show', 1),
 ('frank-sinatra-show', 1),
 ('frankfurt-show', 1),
 ('frankreich-show', 1),
 ('frauen-latenight-show', 1),
 ('frauen-shows', 1),
 ('freaks-shows', 1),
 ('free-cam-show', 1),
 ('freeski-show', 1),
 ('freitag-nachmittag-show', 1),
 ('freitag-radio-show', 1),
 ('freitags-show', 1),
 ('freizeitpark-show', 1),
 ('fremdschäm-orf-show', 1),
 ('fremdschäm-show', 1),
 ('friyay-show', 1),
 ('frühstücks-show', 1),
 ('fulldome-show', 1),
 ('fulldome-shows', 1),
 ('funken-show', 1),
 ('fupa-show', 1),
 ('fussball-show', 1),
 ('fußball-multimedia-show', 1),
 ('fußball-strip-show', 1),
 ('föderalismus-show', 1),
 ('gabanna-show', 1),
 ('gadaffi-show', 1),
 ('gadget-show', 1),
 ('gala-shows', 1),
 ('gang-bang-show', 1),
 ('gastro-show', 1),
 ('gaza-jürgen-show', 1),
 ('gds-awards-show', 1),
 ('geissen-show', 1),
 ('geister-show', 1),
 ('geld-show', 1),
 ('genesis-show', 1),
 ('gentges-show', 1),
 ('gerd-show', 1),
 ('gesang-shows', 1),
 ('gesangs-show', 1),
 ('giga-show', 1),
 ('gina-lisa-show', 1),
 ('girlie-show', 1),
 ('girls-show', 1),
 ('give-a-show', 1),
 ('glamourotic-show', 1),
 ('gmk-show', 1),
 ('gohome-shows', 1),
 ('goodbey-show', 1),
 ('google-road-show', 1),
 ('google-show', 1),
 ('gp-show', 1),
 ('gr-show', 1),
 ('grad-show', 1),
 ('grappa-show', 1),
 ('griezmann-show', 1),
 ('griss-show', 1),
 ('group-show', 1),
 ('grubenlampen-show', 1),
 ('grundeinkommens-show', 1),
 ('grusel-show', 1),
 ('gstettenbauer-show', 1),
 ('guido-show', 1),
 ('gute-laune-show', 1),
 ('guttenberg-pr-show', 1),
 ('gysi-show', 1),
 ('götze-show', 1),
 ('hackfressen-show', 1),
 ('halbzeit-shows', 1),
 ('halbzeitpausen-show', 1),
 ('halftime-shows', 1),
 ('hallen-shows', 1),
 ('halloween-show', 1),
$$H_{L}(show) = 7.93$$
$$H_{I}(show) = 5.51$$
$$H_{R}(show) = 8.54$$
$$H_{T}(show) = 9.04$$
  • show exhibits the most diversity in right-hand slots, slightly less diversity in left-hand slots, and the least diversity in word-internal slots

  • The total entropy for the base is 9.04

  • We compare entropy for a large number of anglicism bases
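
As a rough illustration of how such scores can be obtained, the sketch below computes Shannon entropy in bits over the token counts of the compound types found in one slot. The function name `shannon_entropy` and the toy input are assumptions for illustration, not the authors' code. The base-2 logarithm is inferred from the reported values: a slot with 10 types that each occur once, as for word-internal modus in the results table, gives log2(10) ≈ 3.32.

```python
import math


def shannon_entropy(type_counts):
    """Shannon entropy (in bits) of a list of (compound_type, token_count) pairs."""
    total = sum(count for _, count in type_counts)
    if total == 0:
        return 0.0
    # H = sum over types of p * log2(1 / p), where p is the type's share of tokens
    return sum(
        (count / total) * math.log2(total / count)
        for _, count in type_counts
        if count > 0
    )


# Hypothetical slot with 10 types, each attested once
uniform = [(f"x{i}-modus-y", 1) for i in range(10)]
print(round(shannon_entropy(uniform), 6))  # 3.321928
```

A slot dominated by one or two very frequent types yields a low score, while a slot whose tokens are spread evenly over many types yields a high one.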

Data and methods

Source anglicism bases:

  • 3,262 most common nouns in the British National Corpus (Kilgarriff, 1997)

  • 10,000 most common nouns in the 9.6b-token ENCOW16ax corpus (Schäfer, 2015; Schäfer & Bildhauer, 2012)

  • Remove hyphenated types and types with 3 or fewer characters, convert to lower case

= 8,313 unique types
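
The filtering steps above could be sketched roughly as follows. The helper name `clean_bases`, the toy noun lists, and the decision to merge both sources into one de-duplicated set are assumptions made for illustration; the actual preprocessing may differ in order and detail.

```python
def clean_bases(*noun_lists, min_length=4):
    """Merge noun lists into a sorted list of unique lower-cased candidate bases."""
    bases = set()
    for nouns in noun_lists:
        for noun in nouns:
            noun = noun.lower()
            # Drop hyphenated types and types with 3 or fewer characters
            if "-" in noun or len(noun) < min_length:
                continue
            bases.add(noun)
    return sorted(bases)


# Hypothetical stand-ins for the BNC and ENCOW16ax noun lists
bnc_nouns = ["Show", "team", "e-mail", "TV"]
encow_nouns = ["show", "blog", "web-site", "app"]
print(clean_bases(bnc_nouns, encow_nouns))  # ['blog', 'show', 'team']
```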

Test corpora

  • German Twitter corpus of 534m tokens (Coats, 2018)

  • Corpus of German WordPress blogs with 2.1b tokens (Barbaresi, 2016)

  • DECOW16bx corpus, a German web corpus of ~11b tokens (Schäfer, 2015; Schäfer & Bildhauer, 2012)


  • For each of the 8,313 bases, calculate frequencies and entropy as left-hand, internal, or right-hand constituents, as well as a total entropy score

  • Higher scores = more diversity of types

  • Lower scores = less diversity of types: the base may be more "fossilized", appearing with high frequencies in a few types (e.g. brand names, other named entities)
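
A minimal sketch of the per-slot counting might look like the following. The names `slots_for` and `slot_frequencies`, the rule that a plural in -s still counts as the base, and the handling of compounds in which the base occurs more than once are all assumptions for illustration; the sample counts are a handful of show compounds from the output above. Shannon entropy can then be computed over each slot's token counts.

```python
from collections import Counter

SLOTS = ("left", "intern", "right")


def slots_for(base, compound):
    """Return the set of slots in which `base` occurs in a hyphenated compound."""
    parts = compound.split("-")
    forms = {base, base + "s"}  # assumption: a plural -s still matches the base
    slots = set()
    for i, part in enumerate(parts):
        if part not in forms:
            continue
        if i == 0:
            slots.add("left")
        elif i == len(parts) - 1:
            slots.add("right")
        else:
            slots.add("intern")
    return slots


def slot_frequencies(base, compound_counts):
    """Map each slot to (type frequency, token frequency) for `base`."""
    per_slot = {slot: Counter() for slot in SLOTS}
    for compound, count in compound_counts.items():
        if "-" not in compound:
            continue  # only hyphenated compounds are considered
        for slot in slots_for(base, compound):
            per_slot[slot][compound] += count
    return {slot: (len(c), sum(c.values())) for slot, c in per_slot.items()}


sample = {
    "show-abbruch": 1,
    "show-abend": 2,
    "af-ter-show-par-tyyyyy": 1,
    "abba-show": 5,
    "abba-tribute-show": 1,
    "fashion-shows": 2,
}
print(slot_frequencies("show", sample))
# {'left': (2, 3), 'intern': (1, 1), 'right': (3, 8)}
```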

Research questions

  • Which anglicism bases are most frequent as constituents of hyphenated German compounds?

  • What can morphological diversity measures such as Shannon entropy tell us about the dynamics of anglicisms borrowed into German compounds?

Results: Most frequent hyphenated compounds

left left_typefreq left_tokenfreq intern intern_typefreq intern_tokenfreq right right_typefreq right_tokenfreq word all all_typefreq all_tokenfreq entropy left_entropy intern_entropy right_entropy
2078 ((video-abend, 6), (video-abgabe, 1), (video-a... 1986 10464 ((action-video-clip, 1), (album-video-snippet,... 468 733 ((aaa-videos, 1), (abba-video, 1), (abba-video... 3825 18215 video ((video-abend, 6), (video-abgabe, 1), (video-a... 6279 29412 8.670288 8.721093 7.977295 6.906846
7213 ((twitter-a, 10), (twitter-abbild, 1), (twitte... 4827 26360 ((abstimmungs-twitter-filterbubbles, 1), (adle... 555 653 ((abou-chaker-twitter, 1), (afd-twitter, 2), (... 287 446 twitter ((twitter-a, 10), (twitter-abbild, 1), (twitte... 5669 27459 9.249835 8.992099 8.964237 7.576471
6715 ((facebook-a, 2), (facebook-ab, 1), (facebook-... 3229 23369 ((aal-facebook-gruppe, 1), (aber-facebook-lösc... 317 379 ((abendschau-facebook, 1), (analog-facebook, 1... 97 153 facebook ((facebook-a, 2), (facebook-ab, 1), (facebook-... 3643 23901 8.555522 8.401971 8.141200 5.977287
4279 ((team-a, 1), (team-abend, 11), (team-abendess... 849 2248 ((acommerce-team-treffen, 1), (adwords-team-in... 190 282 ((aa-team, 1), (aal-team, 1), (ab-team, 2), (a... 5424 15583 team ((team-a, 1), (team-abend, 11), (team-abendess... 6463 18113 11.023574 8.628063 6.899955 10.683781
6443 ((chef-ablösung, 1), (chef-abräumer, 1), (chef... 503 1112 ((airbus-chef-tom, 1), (ar-chef-posten, 1), (a... 56 91 ((aa-chef, 4), (aare-chef, 1), (ab-in-den-urla... 3596 16615 chef ((chef-ablösung, 1), (chef-abräumer, 1), (chef... 4155 17818 9.876002 7.812489 4.903222 9.630945
6777 ((news-abend, 1), (news-abfangjäger, 1), (news... 690 2937 ((ad-hoc-news-seite, 1), (agentur-news-geticke... 371 628 ((aal-news, 5), (aare-news, 1), (abschluss-new... 1225 13988 news ((news-abend, 1), (news-abfangjäger, 1), (news... 2286 17553 7.022636 6.980530 7.832721 5.910300
4598 ((youtube-a, 1), (youtube-abbo, 1), (youtube-a... 1458 15187 ((anthropozän-youtube-channel, 1), (anti-is-yo... 134 191 ((absurditäts-youtube, 1), (afd-youtube, 1), (... 51 57 youtube ((youtube-a, 1), (youtube-abbo, 1), (youtube-a... 1643 15435 4.870839 4.714900 6.443502 5.609120
6092 ((blog-a, 1), (blog-a-holic, 1), (blog-abc, 5)... 797 5761 ((aal-blog-beitrag, 1), (anti-blog-stimmung, 1... 132 525 ((ab-blog, 2), (aba-blog, 3), (abgeordnetenwat... 2207 9110 blog ((blog-a, 1), (blog-a-holic, 1), (blog-abc, 5)... 3136 15396 8.695120 5.782124 3.852587 8.881573
3367 ((marketing-abende, 1), (marketing-abenteuern,... 1298 5433 ((affilate-marketing-videoserie, 1), (affiliat... 687 1469 ((abakus-internet-marketing, 1), (abo-marketin... 606 7219 marketing ((marketing-abende, 1), (marketing-abenteuern,... 2591 14121 8.018634 8.636450 8.415283 4.803502
7631 ((media-a, 1), (media-abkommen, 1), (media-abl... 1141 4253 ((ag-social-media-newsletter, 1), (agentur-soc... 1948 8302 ((ag-media, 1), (agrar-media, 12), (alexa-medi... 139 1542 media ((media-a, 1), (media-abkommen, 1), (media-abl... 3228 14097 9.375089 8.340205 8.949528 2.447227
7666 ((start-a-factory, 1), (start-abbruch, 1), (st... 702 9843 ((accelerator-start-ups, 1), (adligen-start-up... 375 792 ((abend-starts, 1), (airbus-start, 1), (akadem... 676 1852 start ((start-a-factory, 1), (start-abbruch, 1), (st... 1753 12487 5.192675 3.282305 7.791666 7.955331
7018 ((account-abhängig, 2), (account-admin, 1), (a... 159 389 ((amazon-account-falle, 1), (amazon-account-sp... 96 131 ((abendblatt-account, 1), (academia-account, 2... 1632 9252 account ((account-abhängig, 2), (account-admin, 1), (a... 1887 9772 6.813946 6.180624 6.136402 6.487652
35 ((internet-ab, 3), (internet-abbrüche, 1), (in... 2399 8156 ((abakus-internet-marketing, 1), (acta-interne... 175 222 ((adsl-internet, 1), (ajax-internet, 1), (aldi... 263 1123 internet ((internet-ab, 3), (internet-abbrüche, 1), (in... 2837 9501 9.964798 9.802741 7.199312 5.937027
1867 ((google-abc, 1), (google-abfrage, 3), (google... 2132 9040 ((affiliate-google-banner, 1), (amp-google-pro... 111 159 ((adidas-google, 1), (amazon-google, 1), (anti... 45 68 google ((google-abc, 1), (google-abfrage, 3), (google... 2288 9267 8.761393 8.638905 6.380263 5.051900
2075 ((interview-abbruch, 2), (interview-abbrüche, ... 341 874 ((alba-interview-serie, 1), (architekten-inter... 45 50 ((aas-interviews, 1), (abc-interview, 2), (abc... 2218 8271 interview ((interview-abbruch, 2), (interview-abbrüche, ... 2604 9195 9.006969 7.140800 5.428758 8.668769
4515 ((event-abend, 3), (event-ablaufplan, 2), (eve... 719 2486 ((after-event-party, 3), (apple-event-schauen,... 128 164 ((aba-event, 1), (abend-event, 14), (abend-eve... 2228 6425 event ((event-abend, 3), (event-ablaufplan, 2), (eve... 3075 9075 9.970468 7.811788 6.692560 9.520586
3335 ((service-abbau, 3), (service-abdeckung, 1), (... 778 2953 ((absperr-service-mitarbeiter, 1), (alpha-serv... 342 780 ((abend-service, 2), (abfall-service, 1), (abh... 1434 5021 service ((service-abbau, 3), (service-abdeckung, 1), (... 2554 8754 9.049421 7.775919 7.090165 7.836843
584 ((shop-a-holic, 1), (shop-a-look, 1), (shop-a-... 329 989 ((animexx-shop-liste, 1), (app-shop-kontrolle,... 114 157 ((abhol-shop, 2), (abhol-shops, 1), (abo-shop,... 1317 7515 shop ((shop-a-holic, 1), (shop-a-look, 1), (shop-a-... 1760 8661 6.514037 6.676252 6.458995 5.756222
728 ((party-abend, 9), (party-abende, 1), (party-a... 821 1737 ((abi-party-fotos, 2), (abstimmungs-party-tool... 265 321 ((abba-jubiläums-party, 1), (abba-party, 1), (... 2397 6207 party ((party-abend, 9), (party-abende, 1), (party-a... 3483 8265 10.336258 8.786440 7.841839 9.613722
3642 ((check-and-balance, 1), (check-anzug, 1), (ch... 148 2867 ((accu-check-mobile-connect, 1), (after-check-... 127 247 ((ab-check, 1), (abcde-check, 1), (abend-check... 1432 4467 check ((check-and-balance, 1), (check-anzug, 1), (ch... 1707 7581 7.448860 1.956753 6.177398 9.107501
6681 ((show-abbruch, 1), (show-abend, 2), (show-abs... 374 655 ((af-ter-show-par-tyyyyy, 1), (afder-show-part... 174 332 ((abba-show, 5), (abba-tribute-show, 1), (abdo... 1874 6455 show ((show-abbruch, 1), (show-abend, 2), (show-abs... 2422 7442 9.038495 7.929430 5.510227 8.540707
5048 ((smartphone-a, 1), (smartphone-abdeckung, 1),... 1211 4756 ((android-smartphone-besitzer, 1), (android-sm... 71 98 ((acer-smartphone, 1), (ai-smartphone, 1), (al... 475 2562 smartphone ((smartphone-a, 1), (smartphone-abdeckung, 1),... 1757 7416 8.622167 8.125577 5.775538 6.691052
5188 ((update-abbruch, 1), (update-abend, 4), (upda... 374 747 ((abend-update-pushs, 1), (agb-update-emails, ... 82 90 ((abend-update, 2), (abendessen-update, 1), (a... 1704 6568 update ((update-abbruch, 1), (update-abend, 4), (upda... 2160 7405 9.243321 7.954416 6.305688 8.793558
7062 ((post-abfragen, 1), (post-abgabestelle, 1), (... 1756 3517 ((abendsonnne-im-auto-post-wanderung-selfie, 1... 143 196 ((abend-posts, 1), (abnehm-posts, 1), (abo-pos... 816 3687 post ((post-abfragen, 1), (post-abgabestelle, 1), (... 2715 7400 8.924523 9.209736 6.666251 6.465252
5903 ((mini-a, 1), (mini-a-thür, 1), (mini-abenteue... 3338 6955 ((aldi-mini-macbook, 1), (allstar-mini-projekt... 152 169 ((advents-minis, 3), (automatik-mini, 3), (bab... 108 209 mini ((mini-a, 1), (mini-a-thür, 1), (mini-abenteue... 3598 7333 10.805826 10.674379 7.164937 6.052175
6973 ((star-accelerator, 2), (star-act, 2), (star-a... 770 1861 ((all-star-album, 1), (all-star-band, 2), (all... 130 193 ((abba-star, 3), (abwehr-star, 4), (action-sta... 1264 5261 star ((star-accelerator, 2), (star-act, 2), (star-a... 2164 7315 9.267195 8.475805 6.260323 8.290992
5 ((test-abbilder, 1), (test-abbruch, 1), (test-... 597 1525 ((ab-test-learnings, 1), (admob-test-tag, 1), ... 122 131 ((aargauer-test, 2), (ab-test, 4), (ab-tests, ... 2152 5571 test ((test-abbilder, 1), (test-abbruch, 1), (test-... 2871 7227 9.942774 7.366455 6.884493 9.593985
6484 ((ticket-a, 1), (ticket-abholstellen, 1), (tic... 399 1263 ((auswärtskarten-ticket-kontingent, 1), (bahn-... 92 108 ((ab-ticket, 4), (ab-tickets, 3), (abc-ticket,... 1630 5650 ticket ((ticket-a, 1), (ticket-abholstellen, 1), (tic... 2121 7021 9.299458 7.066459 6.235390 8.875424
1346 ((tour-abbruch, 1), (tour-abend, 1), (tour-abn... 369 895 ((after-tour-party, 1), (after-tour-schlaf, 2)... 103 120 ((aa-tour, 1), (aaretrail-tour, 1), (aas-tour,... 2143 5832 tour ((tour-abbruch, 1), (tour-abend, 1), (tour-abn... 2615 6847 9.974095 7.457568 6.583933 9.628022
4242 ((software-abo, 4), (software-abo-modell, 1), ... 765 3506 ((app-software-entwicklung, 1), (aps-software-... 135 159 ((aaa-software, 1), (abas-business-software, 4... 985 3040 software ((software-abo, 4), (software-abo-modell, 1), ... 1885 6705 8.971620 7.280220 6.966848 8.524847
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
8303 ((time-abo, 1), (time-anruf, 1), (time-bare, 1... 180 485 ((adventure-time-ugly-christmas-sweater, 1), (... 282 522 ((aaaaacht-time, 1), (action-time, 1), (advent... 762 2208 time ((time-abo, 1), (time-anruf, 1), (time-bare, 1... 1224 3215 8.440877 5.929423 7.308391 7.498760
1850 ((session-absage, 1), (session-anbietende, 1),... 247 592 ((auth-session-handling, 1), (barcamp-parallel... 18 21 ((abbau-session, 2), (abend-session, 10), (abe... 1289 2593 session ((session-absage, 1), (session-anbietende, 1),... 1554 3206 9.588130 6.829880 4.070656 9.341232
5788 ((server-abschaltung, 1), (server-absturz, 4),... 365 816 ((akk-server-notebook, 1), (allround-server-pa... 87 93 ((aasport-server, 1), (access-server, 1), (ad-... 817 2291 server ((server-abschaltung, 1), (server-absturz, 4),... 1269 3200 9.054694 7.664067 6.393892 8.266525
1682 ((marathon-abenteuer, 1), (marathon-abzeichen,... 333 729 ((after-marathon-drink, 1), (after-marathon-lo... 67 76 ((ab-marathon, 1), (abitur-marathon, 1), (abo-... 919 2282 marathon ((marathon-abenteuer, 1), (marathon-abzeichen,... 1319 3087 9.152338 7.475775 5.937274 8.515938
3287 ((modus-flugstrategie, 1), (modus-mail, 1), (m... 7 8 ((arcade-modus-character, 1), (aufwärm-modus-m... 10 10 ((abba-modus, 2), (abenteuer-modus, 1), (aber-... 1456 3066 modus ((modus-flugstrategie, 1), (modus-mail, 1), (m... 1473 3084 9.447236 2.750000 3.321928 9.426860
2374 ((museum-adventskalender, 2), (museum-angestel... 213 365 ((colani-museum-antrag, 1), (dresdener-museums... 17 19 ((aartal-museum, 1), (abba-museum, 7), (adam-r... 952 2699 museum ((museum-adventskalender, 2), (museum-angestel... 1182 3083 9.031410 7.064461 4.037401 8.672667
5504 ((special-abilities, 1), (special-abschiedszug... 242 436 ((advent-special-webcast, 1), (apple-special-e... 65 77 ((abba-special, 1), (abbonenten-special, 1), (... 935 2555 special ((special-abilities, 1), (special-abschiedszug... 1242 3068 8.718218 7.089012 5.720634 8.182031
1595 ((land-act-staudamm, 1), (land-adel, 1), (land... 356 631 ((afd-landes-chef, 1), (auf-dem-land-leben-vor... 114 239 ((aaa-land, 1), (aachen-land, 3), (ab-land, 1)... 919 2142 land ((land-act-staudamm, 1), (land-adel, 1), (land... 1389 3012 9.278131 7.611736 5.480372 8.628828
5073 ((problem-aber, 1), (problem-achterbahn, 1), (... 162 221 ((abendgarderobe-kein-problem-mensch, 1), (akk... 22 22 ((aaa-problem, 3), (abc-problem, 1), (abfrage-... 1588 2744 problem ((problem-aber, 1), (problem-achterbahn, 1), (... 1772 2987 10.089361 7.070712 4.459432 9.895845
5631 ((website-absprungraten, 1), (website-account,... 396 1274 ((agentur-website-problem, 1), (bad-website-sk... 29 41 ((ab-website, 1), (aba-website, 1), (abc-websi... 919 1587 website ((website-absprungraten, 1), (website-account,... 1344 2902 9.211947 6.912290 4.509872 9.196580
7280 ((award-abend, 4), (award-abräumer, 1), (award... 158 352 ((academy-award-futter, 1), (architektur-award... 68 74 ((aal-award, 1), (abi-awards, 1), (acadamy-awa... 1130 2470 award ((award-abend, 4), (award-abräumer, 1), (award... 1356 2896 9.570063 6.298914 6.047291 9.320438
1534 ((city-abklatsch, 1), (city-ableger, 1), (city... 542 1607 ((airport-city-day-ticket, 1), (bay-city-rolle... 112 173 ((ac-city, 1), (afc-city, 1), (airport-city, 2... 382 1019 city ((city-abklatsch, 1), (city-ableger, 1), (city... 1036 2799 8.740469 7.650724 6.361619 7.460839
1675 ((power-ableger, 1), (power-achter, 1), (power... 691 1645 ((ac-power-markt, 1), (anti-power-point-partei... 159 172 ((abb-power, 1), (abo-power, 1), (abspritz-pow... 560 953 power ((power-ableger, 1), (power-achter, 1), (power... 1410 2770 9.460823 8.340854 7.266324 8.229437
1122 ((highlight-abend, 1), (highlight-adressen, 1)... 200 443 ((animation-highlights-präsentationen, 1), (ce... 20 20 ((abend-highlight, 1), (abschluss-highlight, 4... 857 2257 highlight ((highlight-abend, 1), (highlight-adressen, 1)... 1077 2720 8.799028 6.367892 4.321928 8.469974
3983 ((deal-aktion, 1), (deal-aktionen, 1), (deal-a... 63 90 ((close-this-deal-now, 1), (daily-deal-mann, 1... 17 18 ((abflussfee-deal, 1), (accor-fairmont-deal, 1... 1008 2552 deal ((deal-aktion, 1), (deal-aktionen, 1), (deal-a... 1088 2660 8.748959 5.696190 4.058814 8.606778
2333 ((support-abrechnungstool, 1), (support-abteil... 318 992 ((amazon-support-chat, 2), (apple-id-aussperru... 82 95 ((abi-support, 1), (acer-support, 1), (actives... 684 1523 support ((support-abrechnungstool, 1), (support-abteil... 1084 2610 8.714356 6.750285 6.259226 8.162452
7582 ((alarm-abbruch, 1), (alarm-abos, 1), (alarm-a... 52 72 ((amok-alarm-system, 1), (arschloch-alarm-app,... 18 21 ((abc-alarm, 20), (abc-alarms, 1), (ablöse-ala... 962 2490 alarm ((alarm-abbruch, 1), (alarm-abos, 1), (alarm-a... 1032 2583 8.352629 5.314594 4.070656 8.215788
2555 ((style-abc, 1), (style-action, 1), (style-akr... 185 463 ((afd-style-verknappung, 1), (alpen-style-burg... 84 91 ((abathur-style, 1), (abb-style, 1), (abranz-s... 1295 1996 style ((style-abc, 1), (style-action, 1), (style-akr... 1564 2550 9.805696 6.132223 6.331970 9.672612
5782 ((winter-abc, 1), (winter-abend, 2), (winter-a... 989 2017 ((alfa-winter-foto, 1), (alfa-winter-fotos, 1)... 133 185 ((abfahrt-winter, 1), (aktiv-winter, 1), (alib... 122 345 winter ((winter-abc, 1), (winter-abend, 2), (winter-a... 1244 2547 9.361418 9.129133 6.669673 5.282222
7941 ((album-abschlusskonzert, 1), (album-acappella... 143 397 ((bestmusictalent-album-tipp, 4), (bestmusicta... 21 25 ((abalonia-album, 1), (abdi-album, 2), (acoust... 959 2116 album ((album-abschlusskonzert, 1), (album-acappella... 1123 2538 8.937372 5.992747 4.243856 8.702028
5751 ((multi-abgeordneten-account, 1), (multi-acade... 994 2437 ((anti-multi-kulti-hetzer, 1), (aria-multi-con... 29 30 ((agro-multis, 1), (biotech-multi, 1), (biotec... 43 56 multi ((multi-abgeordneten-account, 1), (multi-acade... 1066 2523 8.413525 8.276048 4.840224 5.214153
4371 ((shirt-abos, 1), (shirt-aktion, 4), (shirt-al... 72 100 ((achilles-buch-shirt-paket, 1), (audrey-t-shi... 70 92 ((aarau-shirt, 1), (abba-shirt, 1), (abercromb... 1608 2329 shirt ((shirt-abos, 1), (shirt-aktion, 4), (shirt-al... 1750 2521 10.367522 5.892662 5.653997 10.242985
5947 ((standard-a, 4), (standard-abc, 1), (standard... 1019 1593 ((android-standard-browser, 1), (business-inte... 64 67 ((ac-standard, 2), (accessibility-standards, 1... 440 760 standard ((standard-a, 4), (standard-abc, 1), (standard... 1523 2420 9.989035 9.412138 5.965270 8.161374
413 ((premium-a, 2), (premium-abenteuer, 2), (prem... 899 2177 ((aka-aki-premium-bezahl-account, 2), (apple-p... 75 114 ((anti-premium, 1), (bf-premium, 1), (cartoon-... 41 102 premium ((premium-a, 2), (premium-abenteuer, 2), (prem... 1015 2393 8.762617 8.535340 5.829919 4.517685
3926 ((mode-abc, 7), (mode-abstimmung, 1), (mode-ab... 596 1258 ((anti-mode-mode, 1), (anti-mode-terror, 1), (... 76 104 ((ackermann-mode, 1), (activetrack-mode, 1), (... 490 895 mode ((mode-abc, 7), (mode-abstimmung, 1), (mode-ab... 1162 2257 9.525621 8.511114 5.961076 8.330156
3466 ((logo-abs, 1), (logo-abstimming, 1), (logo-ab... 238 490 ((app-logo-quiz, 1), (apple-logo-tatoo, 2), (a... 30 34 ((abi-logo, 1), (absolventa-logo, 1), (adac-lo... 950 1714 logo ((logo-abs, 1), (logo-abstimming, 1), (logo-ab... 1218 2238 9.443094 7.114346 4.829966 9.069214
701 ((baby-a-a, 1), (baby-aa, 1), (baby-aale, 1), ... 867 1489 ((after-baby-body, 7), (after-baby-bodys, 1), ... 49 88 ((abendbuffet-baby, 1), (abstiegskampf-baby, 1... 359 577 baby ((baby-a-a, 1), (baby-aa, 1), (baby-aale, 1), ... 1275 2154 9.732625 9.157437 4.372483 8.055854
6566 ((band-abend, 1), (band-achat-perlen, 2), (ban... 188 332 ((alltime-greatest-band-eve, 1), (bestmusictal... 68 73 ((abi-band, 1), (abklatsch-band, 1), (acapella... 830 1656 band ((band-abend, 1), (band-achat-perlen, 2), (ban... 1086 2061 9.201629 6.864101 6.042497 8.753347
5642 ((text-account, 1), (text-ad, 1), (text-ads, 3... 397 874 ((appetit-text-häppchen, 1), (appstore-text-op... 85 146 ((ab-text, 1), (abo-text, 2), (about-text, 3),... 638 1040 text ((text-account, 1), (text-ad, 1), (text-ads, 3... 1120 2060 9.077875 7.063622 5.201136 8.753219
7127 ((nerd-abend, 3), (nerd-abteilung, 1), (nerd-a... 550 936 ((akademiker-nerd-bonus, 1), (angeber-nerd-wis... 62 64 ((abi-jahrgangstufen-nerd, 1), (aldi-nerd, 1),... 433 729 nerd ((nerd-abend, 3), (nerd-abteilung, 1), (nerd-a... 1045 1729 9.462777 8.531720 5.937500 8.167483

111 rows × 17 columns

left left_typefreq left_tokenfreq intern intern_typefreq intern_tokenfreq right right_typefreq right_tokenfreq word all all_typefreq all_tokenfreq entropy left_entropy intern_entropy right_entropy
6092 ((blog-a-ward, 1), (blog-account, 13), (blog-a... 885 7294 ((0815-blog-template, 1), (14-tage-blog-fitmac... 1184 2033 ((04-blogs, 1), (05-blog, 2), (0815-blog, 7), ... 12005 37997 blog ((blog-a-ward, 1), (blog-account, 13), (blog-a... 14074 47324 11.835847 6.862641 9.068396 11.861069
584 ((shop-accounts, 1), (shop-administrator, 2), ... 170 494 ((100-yen-shop-fieber, 1), (250-shops-mall, 1)... 356 546 ((02-shop, 1), (02-shops, 2), (100-en-shop, 4)... 3818 38231 shop ((shop-accounts, 1), (shop-administrator, 2), ... 4344 39271 5.176090 6.288191 8.023523 4.912667
4279 ((team-412-stand, 2), (team-6-gate, 1), (team-... 346 1290 ((2000-meter-team-staffel, 1), (2er-team-aufga... 360 543 ((007-team, 2), (05er-team, 1), (09-team, 2), ... 11048 29533 team ((team-412-stand, 2), (team-6-gate, 1), (team-... 11754 31366 12.100307 7.159381 7.970333 11.996497
2985 ((system-a-new-dimension-in-endodontics, 1), (... 256 799 ((10-finger-system-schreibtrainer, 2), (10-ste... 213 314 ((0x-system, 1), (10-20-system, 1), (10-blindf... 8680 25415 system ((system-a-new-dimension-in-endodontics, 1), (... 9149 26528 11.625423 6.814389 7.297087 11.530372
2078 ((video-8-player, 1), (video-abstracts, 1), (v... 859 5458 ((1080p-video-aufnahmen, 2), (1080p-video-unte... 843 1118 ((06-07-video, 1), (0815-videos, 1), (10-bit-v... 5280 16668 video ((video-8-player, 1), (video-abstracts, 1), (v... 6982 23244 10.264906 7.311496 9.491506 9.825952
6443 ((chef-admin, 2), (chef-administrator, 1), (ch... 128 457 ((15-minuten-chef-gefühl, 1), (ace-chef-jurist... 138 174 ((007-chef, 1), (05-chef, 2), (19-punkte-chef,... 4696 21058 chef ((chef-admin, 2), (chef-administrator, 1), (ch... 4962 21689 9.938513 5.405011 6.893611 9.841089
3225 ((version-administration, 1), (version-check, ... 26 36 ((android-version-jelly, 1), (bewertungsbogen-... 24 30 ((02-pdf-version, 1), (0711-version, 1), (07er... 5211 20452 version ((version-administration, 1), (version-check, ... 5261 20518 9.760410 4.503258 4.506891 9.742793
3335 ((service-account, 9), (service-accounts, 7), ... 348 1947 ((0180-service-nummer, 1), (0800-service-hotli... 666 1197 ((04-service, 2), (10-gratis-sms-service, 1), ... 2676 16207 service ((service-account, 9), (service-accounts, 7), ... 3690 19351 6.980357 6.193502 8.536443 6.009625
3517 ((film-1-soundtrack, 2), (film-a-film-unfinshe... 495 2486 ((100-top-film-liste, 1), (16-millimeter-film-... 646 785 ((007-film, 3), (007-films, 3), (007-pierce-br... 4510 16002 film ((film-1-soundtrack, 2), (film-a-film-unfinshe... 5651 19273 10.517818 6.521741 9.182178 10.250267
6950 ((dollar-a-month, 1), (dollar-a-year, 1), (dol... 95 263 ((000-dollar-angebot, 1), (000-dollar-frage, 1... 932 1478 ((10-dollar, 1), (10-milliarden-dollar, 1), (1... 382 15926 dollar ((dollar-a-month, 1), (dollar-a-year, 1), (dol... 1409 17667 2.359112 5.475742 9.425207 1.069794
6681 ((show-adapter, 1), (show-agency-group, 4), (s... 206 547 ((acappella-show-rock, 2), (after-show-absacke... 417 800 ((10-minuten-show, 1), (100-sekunden-chart-sho... 3840 15130 show ((show-adapter, 1), (show-agency-group, 4), (s... 4463 16477 9.477024 6.708962 6.519493 9.202061
728 ((party-action-adrenalin-city, 1), (party-agen... 435 1208 ((06-we-b-dooinit-acappella-party-1, 1), (1st-... 769 1205 ((00er-party, 1), (0815-party, 1), (0815-party... 4298 12674 party ((party-action-adrenalin-city, 1), (party-agen... 5502 15087 10.588637 7.839408 8.394676 10.113977
1346 ((tour-agency, 1), (tour-agenda, 1), (tour-alb... 198 701 ((20-jahre-pfok-tour-in-england, 1), (2000-tou... 252 311 ((000-kilometer-tour, 1), (07-hanoi-tour, 2), ... 4691 13546 tour ((tour-agency, 1), (tour-agenda, 1), (tour-alb... 5141 14558 10.547958 5.921411 7.788490 10.392927
2763 ((euro-08-chef, 1), (euro-08-host-city, 1), (e... 666 7661 ((000-euro-preisgeld, 1), (10-euro-abgabe, 1),... 1946 5356 ((10-euro, 1), (10-milliarden-euro, 1), (100-e... 386 1180 euro ((euro-08-chef, 1), (euro-08-host-city, 1), (e... 2998 14197 7.514866 4.479499 8.496801 7.014356
6715 ((facebook-account, 2278), (facebook-account-b... 771 13128 ((100-facebook-fans-gewinnspiels, 1), (100-fac... 554 786 ((4lyn-facebook, 1), (5000-facebook, 1), (abba... 150 199 facebook ((facebook-account, 2278), (facebook-account-b... 1475 14113 6.115341 5.500648 8.680537 7.044713
5 ((test-a-mint, 1), (test-account, 25), (test-a... 335 906 ((11-gang-test-menü, 1), (14-tage-test-version... 315 422 ((10-km-test, 1), (10-stunden-test, 2), (10-te... 3937 11613 test ((test-a-mint, 1), (test-account, 25), (test-a... 4587 12941 10.328446 7.513457 8.092007 9.994368
4515 ((event-abenteuer-workshop-wasweissich-wochene... 280 972 ((15206-event-mouse, 1), (8042-serio-1-event-m... 242 332 ((0815-events, 1), (10-day-event, 1), (10-even... 3630 11584 event ((event-abenteuer-workshop-wasweissich-wochene... 4152 12888 10.322990 6.776658 7.701960 10.077604
7018 ((account-administration, 1), (account-agent, ... 38 79 ((auto-account-assistent, 1), (bank-account-er... 78 133 ((0xp-accounts, 1), (10-tage-probe-account, 1)... 1634 12060 account ((account-administration, 1), (account-agent, ... 1750 12272 6.323315 4.577480 5.566043 6.198289
1876 ((forum-account, 1), (forum-accounts, 3), (for... 105 257 ((20er-jahre-forum-gebäude, 1), (ads-forum-kop... 105 141 ((04-tausend-freunde-forum, 1), (08-fuffzehn-f... 4096 11701 forum ((forum-account, 1), (forum-accounts, 3), (for... 4306 12099 10.711520 5.807357 6.476545 10.622559
6566 ((band-1-teil-1-2004-441-s-text, 1), (band-1-t... 230 672 ((10-band-equalizer, 5), (100-band-equalizer, ... 323 430 ((02-band, 1), (03-band, 1), (0815-band, 1), (... 3342 10957 band ((band-1-teil-1-2004-441-s-text, 1), (band-1-t... 3895 12059 10.081007 6.802394 8.021230 9.780440
2096 ((club-08-10-cable, 1), (club-08-10-cable-2011... 294 780 ((1000-kumpel-und-malocher-club-freunden, 1), ... 435 601 ((04-club, 1), (089-club, 1), (100-clubs, 1), ... 3041 10097 club ((club-08-10-cable, 1), (club-08-10-cable-2011... 3770 11478 10.361785 6.933252 8.502558 9.999397
4154 ((regime-chance, 1), (regime-change, 176), (re... 69 342 ((anti-regime-bewegung, 3), (anti-regime-blogg... 51 71 ((14-regime, 1), (abbadon-regime, 1), (abbas-r... 1085 10885 regime ((regime-chance, 1), (regime-change, 176), (re... 1205 11298 6.961636 3.581080 5.526301 6.817240
4242 ((software-administrator, 1), (software-agent,... 354 1968 ((140506_permira-funds-to-acquire-leading-soft... 407 579 ((16bit-software, 1), (20-euro-software, 1), (... 2268 8542 software ((software-administrator, 1), (software-agent,... 3029 11089 9.603950 6.218432 8.405681 9.225295
35 ((internet-abc-module, 1), (internet-access, 1... 1150 8425 ((0815-internet-user, 2), (10-minuten-internet... 661 980 ((3d-internet, 11), (3d-internets, 2), (3g-int... 374 1254 internet ((internet-abc-module, 1), (internet-access, 1... 2185 10659 8.466999 7.435192 8.948822 6.964626
3367 ((marketing-accounts, 2), (marketing-addict, 1... 325 1744 ((10-gehalt-marketing-branche, 1), (360-grad-m... 818 1419 ((1a-marketing, 1), (2011_location-based-marke... 958 7456 marketing ((marketing-accounts, 2), (marketing-addict, 1... 2101 10619 7.878334 6.670314 9.093224 6.257288
6989 ((code-60-phase, 2), (code-academy, 1), (code-... 144 387 ((20-zeilen-code-patch, 1), (analyzing-hadoop-... 227 355 ((10-punkte-code, 1), (13-mann-code, 1), (15-e... 1545 9508 code ((code-60-phase, 2), (code-academy, 1), (code-... 1916 10250 7.247052 6.046257 7.347304 6.810221
6777 ((news-account, 3), (news-activity, 2), (news-... 261 966 ((24-stunden-news-zyklus, 1), (24h-news-cycle,... 412 742 ((14-uhr-news, 1), (18599-news, 2), (1904-news... 1641 8180 news ((news-account, 3), (news-activity, 2), (news-... 2314 9888 8.698966 6.544295 7.511009 8.052422
7941 ((album-abschluss-song, 1), (album-album, 8), ... 109 883 ((40-album-charts, 2), (amazon-album-charts, 1... 84 126 ((0815-album, 2), (10-album, 3), (10-punkte-al... 2645 8774 album ((album-abschluss-song, 1), (album-album, 8), ... 2838 9783 9.050583 4.641614 6.130224 8.939829
7631 ((media-access-control, 2), (media-access-cont... 438 1177 ((4stage-media-group, 1), (50-smart-ways-to-cr... 1768 6594 ((3h-media, 5), (3rze-media, 1), (60-prozent-d... 346 1992 media ((media-access-control, 2), (media-access-cont... 2552 9763 8.840740 7.632080 8.678836 4.120127
2075 ((interview-activity, 1), (interview-archive, ... 101 479 ((2015-interview-walker, 1), (_dlf-interview-u... 75 89 ((11-fragen-interview, 7), (11-freunde-intervi... 2242 9055 interview ((interview-activity, 1), (interview-archive, ... 2418 9623 8.952230 4.852183 6.130173 8.813707
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
6700 ((line-album, 2), (line-arena, 1), (line-array... 108 281 ((1st-line-therapie, 1), (360-line-up, 2), (3d... 289 398 ((10-yd-line, 1), (1000-line, 2), (10yds-line,... 824 1942 line ((line-album, 2), (line-arena, 1), (line-array... 1221 2621 9.376426 5.272676 7.955038 8.805520
6137 ((freak-a-leak, 1), (freak-accident, 1), (frea... 78 211 ((akte-x-freak-studentenverbindungen, 1), (bie... 42 56 ((30er-freaks, 1), (60er-jahre-freaks, 1), (64... 1143 2350 freak ((freak-a-leak, 1), (freak-accident, 1), (frea... 1263 2617 9.275033 4.655760 5.070137 9.176404
2411 ((minute-by-minute, 4), (minute-chance, 2), (m... 24 38 ((10-minute-mail, 1), (10-minute-mails, 1), (1... 794 1768 ((10-minute, 2), (10-minutes, 1), (10-to-20-mi... 193 771 minute ((minute-by-minute, 4), (minute-chance, 2), (m... 1011 2577 8.163729 4.333325 8.504806 4.283174
1615 ((crew-album, 1), (crew-bashing, 1), (crew-bas... 51 89 ((ballon-crew-sachsen, 1), (cabin-crew-leader,... 28 32 ((08xx-crew, 1), (101-crew, 1), (10er-crew, 1)... 1327 2456 crew ((crew-album, 1), (crew-bashing, 1), (crew-bas... 1406 2577 9.685900 5.335025 4.750000 9.580040
2624 ((pause-blog, 2), (pause-button, 15), (pause-b... 18 138 ((15-minuten-geh-pause-an-der-frischen-luft, 1... 33 39 ((10-jahres-pause, 2), (10-minuten-pause, 8), ... 1016 2381 pause ((pause-blog, 2), (pause-button, 15), (pause-b... 1067 2558 8.500896 1.925452 4.926428 8.494105
529 ((drama-action, 1), (drama-anime, 2), (drama-a... 87 270 ((16-episoden-drama-verwandten, 1), (abenteuer... 100 121 ((007-drama, 1), (1999-drama, 1), (1999er-dram... 910 2141 drama ((drama-action, 1), (drama-anime, 2), (drama-a... 1097 2532 8.949142 4.240389 6.507504 8.783774
2903 ((clan-action, 1), (clan-album, 2), (clan-bloc... 51 119 ((aggro-clan-libanese, 1), (bush-clan-mitglied... 36 40 ((abels-clan, 1), (aborigenese-clans, 1), (abo... 1122 2361 clan ((clan-action, 1), (clan-album, 2), (clan-bloc... 1209 2520 9.309541 4.693655 5.121928 9.195855
6780 ((feature-album, 1), (feature-artist, 1), (fea... 75 162 ((30-minuten-feature-produktion, 1), (az-obama... 34 38 ((14-features, 1), (1password-features, 1), (2... 1055 2291 feature ((feature-album, 1), (feature-artist, 1), (fea... 1164 2491 9.247516 5.885504 5.017536 9.055777
6984 ((bashing-blog, 3), (bashing-festival, 6), (ba... 12 20 ((adhs-bashing-artikel, 1), (banken-bashing-fi... 42 54 ((4e-bashing, 2), (68er-bashing, 8), (abendbro... 999 2399 bashing ((bashing-blog, 3), (bashing-festival, 6), (ba... 1053 2473 9.085037 3.208695 4.976114 9.000558
4996 ((newsletter-account, 1), (newsletter-ankündig... 61 195 ((ahk-newsletter-team, 1), (attacke-newsletter... 30 50 ((10-minuten-internet-newsletter, 1), (151-new... 959 2218 newsletter ((newsletter-account, 1), (newsletter-ankündig... 1050 2463 8.926205 4.903945 4.176691 8.787320
1236 ((world-adapter, 1), (world-aids-day, 1), (wor... 438 812 ((1st-world-menschen, 1), (1st-world-problems,... 468 703 ((02-world, 6), (20-awesomely-untranslatable-w... 495 943 world ((world-adapter, 1), (world-aids-day, 1), (wor... 1401 2458 9.893448 8.191467 8.427592 8.347394
4472 ((hype-alarm, 2), (hype-anime, 2), (hype-auto,... 60 107 ((airplay-hype-bands, 1), (anti-hype-blog, 9),... 62 88 ((2012-hype, 9), (2012-weltuntergangs-hype, 1)... 1128 2231 hype ((hype-alarm, 2), (hype-anime, 2), (hype-auto,... 1250 2426 9.397184 5.266193 5.659390 9.217154
2333 ((support-account, 1), (support-accounts, 3), ... 134 625 ((20-minuten-support-boykott, 2), (24-stunden-... 142 209 ((1080p-support, 1), (1a-support, 2), (1st-lev... 727 1568 support ((support-account, 1), (support-accounts, 3), ... 1003 2402 8.786066 5.369020 6.626067 8.576952
1918 ((block-8-land, 1), (block-abfertigungs-plan, ... 88 190 ((action-block-buster, 1), (ad-block-plus, 5),... 89 124 ((02-block, 1), (0²-block, 1), (1000er-block, ... 1115 2070 block ((block-8-land, 1), (block-abfertigungs-plan, ... 1292 2384 9.715434 5.813593 6.233565 9.487949
8292 ((model-2-model-transformation, 2), (model-act... 130 370 ((alles-ist-schön-top-model-shows-gesellschaft... 120 143 ((36-model, 1), (36minus-model, 1), (3d-city-m... 766 1833 model ((model-2-model-transformation, 2), (model-act... 1016 2346 8.690716 6.117171 6.808370 8.148314
4799 ((gang-action, 4), (gang-adventure, 1), (gang-... 73 167 ((10-gang-doppelkupplungsgetriebe, 1), (10-gan... 230 423 ((10-20-minuten-pipi-gang, 1), (10-gang, 3), (... 991 1749 gang ((gang-action, 4), (gang-adventure, 1), (gang-... 1294 2339 9.679775 5.265745 7.033454 9.361611
7346 ((self-abandonment, 3), (self-ability, 1), (se... 872 2105 ((99-cent-self-publisher-story, 1), (abwesenhe... 78 101 ((agent-self, 1), (alligator-minigolf-self, 1)... 66 116 self ((self-abandonment, 3), (self-ability, 1), (se... 1016 2322 9.331608 9.084620 6.083796 5.811567
6898 ((card-adapter, 1), (card-barometer, 3), (card... 73 124 ((abo-card-besitzer, 2), (abo-card-inhaber, 1)... 225 335 ((100-card, 1), (16gb-sd-card, 1), (1gb-sd-car... 959 1855 card ((card-adapter, 1), (card-barometer, 3), (card... 1257 2314 9.735530 5.929248 7.486321 9.291442
6255 ((junkie-bassist, 1), (junkie-bitch, 1), (junk... 19 19 ((anti-junkie-schwarzlicht, 1), (baby-robbe-un... 21 21 ((140-zeichen-junkie, 1), (24-jack-bauer-junki... 1049 2252 junkie ((junkie-bassist, 1), (junkie-bitch, 1), (junk... 1089 2292 9.133431 4.247928 4.392317 9.071992
5620 ((ring-a-buzz, 2), (ring-a-ring-a-roses, 1), (... 98 184 ((alfred-nobel-ring-anlieger, 1), (am-ring-fan... 112 158 ((12-tonnen-wurmloch-ring, 1), (148er-ring, 1)... 995 1935 ring ((ring-a-buzz, 2), (ring-a-ring-a-roses, 1), (... 1205 2277 9.631400 6.062894 6.620400 9.322376
3983 ((deal-administration, 1), (deal-alert, 1), (d... 25 45 ((ab-in-den-urlaub-deal-gutschein, 2), (adress... 39 43 ((10-milliarden-deal, 1), (100-millionen-dolla... 951 2155 deal ((deal-administration, 1), (deal-alert, 1), (d... 1015 2243 9.014190 4.227545 5.240218 8.900112
5699 ((lager-abbau-revival, 1), (lager-basis, 1), (... 23 41 ((42500-lager-in-der-ns-zeit-niemand-konnte-we... 43 54 ((007-lager, 1), (1000-mann-lager, 1), (16er-l... 1034 2087 lager ((lager-abbau-revival, 1), (lager-basis, 1), (... 1100 2182 9.295775 4.236231 5.245447 9.185026
5563 ((pack-a-bowl, 1), (pack-a-punch, 5), (pack-a-... 35 68 ((10er-pack-stempel, 1), (5er-pack-wanderung, ... 108 126 ((000er-pack, 1), (10-er-pack, 1), (10-er-pack... 910 1985 pack ((pack-a-bowl, 1), (pack-a-punch, 5), (pack-a-... 1053 2179 9.191108 4.629249 6.641054 8.942346
7785 ((house-affair, 1), (house-album, 6), (house-a... 219 514 ((90er-french-house-generation, 1), (90er-hous... 340 423 ((100-mile-house, 7), (105-mile-house, 1), (10... 545 1191 house ((house-affair, 1), (house-album, 6), (house-a... 1104 2128 9.310220 6.917878 8.274387 8.160874
8016 ((boot-an-land-schleppen, 1), (boot-animation,... 160 389 ((140418-boot-beschriftung-svea, 1), (140418-b... 174 223 ((10-loch-boots, 1), (10-meter-bootes, 1), (10... 728 1494 boot ((boot-an-land-schleppen, 1), (boot-animation,... 1062 2106 9.026800 6.254062 7.214296 8.405997
335 ((book-a-cook, 2), (book-a-cook-auftrag, 1), (... 202 538 ((13-books-store, 1), (amazon-book-shit, 1), (... 210 279 ((10-book, 1), (62-seiten-flip-book, 1), (75e-... 617 1289 book ((book-a-cook, 2), (book-a-cook-auftrag, 1), (... 1029 2106 9.111032 6.381152 7.548933 8.427341
235 ((school-administration, 1), (school-aged, 6),... 153 241 ((70er-jahre-high-school-portrait, 1), (80er-j... 653 994 ((3d-schools, 2), (93-and-school, 1), (after-s... 293 771 school ((school-administration, 1), (school-aged, 6),... 1099 2006 9.009722 6.944610 8.983356 6.048083
3346 ((trick-animation, 1), (trick-bike, 1), (trick... 48 69 ((accessoires-trick-kiste, 1), (alliterations-... 42 64 ((007-tricks, 1), (3d-tricks, 1), (51yds-trick... 993 1695 trick ((trick-animation, 1), (trick-bike, 1), (trick... 1083 1828 9.440248 5.403754 4.831051 9.294571
7823 ((front-airbag, 2), (front-airbags, 2), (front... 136 351 ((agnostic-front-cover, 1), (agnostic-front-le... 64 78 ((18er-front, 1), (4er-front, 1), (50mm-front,... 875 1344 front ((front-airbag, 2), (front-airbags, 2), (front... 1075 1773 9.571996 6.187834 5.916750 9.396498
3784 ((tradition-blog, 1), (tradition-bound, 9), (t... 41 65 ((anti-traditions-muslim, 1), (anti-traditions... 5 5 ((100-tage-tradition, 1), (1940-er-noir-tradit... 1094 1702 tradition ((tradition-blog, 1), (tradition-bound, 9), (t... 1140 1772 9.781913 5.013924 2.321928 9.720770

211 rows × 17 columns
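
The frequency and entropy columns can be reproduced from the (compound, count) pairs in each position column. The sketch below is an assumption about how they were derived, not the authors' code: type frequency is the number of distinct compounds, token frequency is the sum of their counts, and entropy is base-2 Shannon entropy over the token counts. The table values are consistent with this reading: for tradition, 5 internal types with 5 tokens give an intern_entropy of 2.321928, which equals log2(5). The function names and the toy compounds are hypothetical.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a list of token counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def summarize(compounds):
    """Summarize an iterable of (compound_type, token_count) pairs."""
    counts = [n for _, n in compounds]
    return {
        "typefreq": len(counts),       # number of distinct compounds
        "tokenfreq": sum(counts),      # total occurrences
        "entropy": shannon_entropy(counts),
    }


# Hypothetical toy data: five compound types, each attested once.
uniform = [(f"x{i}-tradition-y{i}", 1) for i in range(5)]
# Hypothetical toy data: one compound dominates the distribution.
skewed = [("a-dollar", 96), ("b-dollar", 2), ("c-dollar", 2)]

print(summarize(uniform))
print(summarize(skewed))
```

Entropy is highest when tokens are spread evenly over many types and approaches zero when a few types account for most tokens, which is the pattern seen for dollar in right position (many tokens, very low right_entropy).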

left left_typefreq left_tokenfreq intern intern_typefreq intern_tokenfreq right right_typefreq right_tokenfreq word all all_typefreq all_tokenfreq entropy left_entropy intern_entropy right_entropy
2985 ((system-a, 4), (system-abbau, 2), (system-abb... 5688 31926 ((aaorrac-system-pwf, 1), (abas-system-bowl, 1... 2588 5580 ((aa-system, 8), (aaa-core-system, 1), (aaa-sy... 68952 544725 system ((system-a, 4), (system-abbau, 2), (system-abb... 77228 582231 12.221675 9.911026 9.679466 11.972890
35 ((internet-a, 3), (internet-a-gps, 6), (intern... 23427 487181 ((abakus-internet-forum, 1), (abakus-internet-... 4045 8641 ((abakus-internet, 1), (abendland-internet, 1)... 1301 11251 internet ((internet-a, 3), (internet-a-gps, 6), (intern... 28773 507073 9.152062 8.902414 10.616467 6.331341
1876 ((forum-a, 1), (forum-ab, 3), (forum-abc, 1), ... 5836 39116 ((abends-im-forum-die-zeit-tot-schlagen, 2), (... 1920 2964 ((aa-europa-forum, 1), (aa-fan-forum, 1), (aa-... 26808 370759 forum ((forum-a, 1), (forum-ab, 3), (forum-abc, 1), ... 34564 412839 10.634618 8.964246 10.272242 10.243028
6989 ((code-a, 6), (code-a-disc, 2), (code-a-finale... 2542 18719 ((abap-code-analyse, 1), (abap-code-projekt, 1... 1474 2705 ((aa-code, 6), (aaa-code, 6), (aacs-code, 2), ... 7506 377274 code ((code-a, 6), (code-a-disc, 2), (code-a-finale... 11522 398698 4.174569 8.461761 9.611394 3.572597
4279 ((team-a, 5), (team-ab, 1), (team-ab-racing, 3... 5313 37937 ((aal-team-hh, 1), (abenteuer-team-aufgabe, 1)... 2518 4918 ((aa-team, 13), (aaa-dj-team, 2), (aaa-info-te... 54772 316350 team ((team-a, 5), (team-ab, 1), (team-ab-racing, 3... 62603 359205 12.848199 9.312728 10.359035 12.642428
584 ((shop-a, 1), (shop-a-gogo, 1), (shop-a-holic,... 2268 38833 ((abo-shop-site, 2), (actionfiguren-shop-seite... 2187 6751 ((aa-shop, 13), (aaa-shop, 1), (aafes-shop, 1)... 15480 299395 shop ((shop-a, 1), (shop-a-gogo, 1), (shop-a-holic,... 19935 344979 6.071884 6.506082 8.656748 5.216132
3225 ((version-a, 4), (version-aber, 1), (version-a... 470 1574 ((aktuelle-version-thread, 1), (alpha-version-... 221 267 ((aa-version, 6), (aaa-version, 4), (aac-kapit... 28026 335784 version ((version-a, 4), (version-aber, 1), (version-a... 28717 337625 9.889244 7.065319 7.666834 9.851862
4242 ((software-a-a-service-dienstleister, 1), (sof... 7842 157546 ((abas-erp-software-architektur, 1), (abas-erp... 3472 6311 ((aa-software, 27), (aaa-logo-software, 1), (a... 18664 161393 software ((software-a-a-service-dienstleister, 1), (sof... 29978 325250 10.473335 8.070525 10.927506 10.546890
3335 ((service-a, 1), (service-aafes, 1), (service-... 5797 106357 ((abc-service-fahrzeug, 1), (abendessen-servic... 7385 29408 ((aa-service, 1), (aaf-service, 3), (aai-servi... 18181 185457 service ((service-a, 1), (service-aafes, 1), (service-... 31363 321222 10.458371 8.119012 9.813824 9.648214
2078 ((video-a, 9), (video-aasgeier, 1), (video-abb... 11369 144030 ((aas-video-guide, 1), (abc-video-podcast, 1),... 4910 11052 ((aaa-video, 1), (aac-video, 1), (aanalog-vide... 11947 99349 video ((video-a, 9), (video-aasgeier, 1), (video-abb... 28226 254431 10.268572 9.320311 10.796706 8.534387
7594 ((html-a, 1), (html-a-tag, 3), (html-abbild, 2... 4661 251459 ((abfrage-html-code, 1), (actionscript-html-ja... 595 1017 ((admin-html, 1), (ak-html, 4), (allerwelts-ht... 307 1006 html ((html-a, 1), (html-a-tag, 3), (html-abbild, 2... 5563 253482 3.666283 3.559640 8.375398 6.646972
4458 ((portal-aber, 1), (portal-abgrund, 1), (porta... 1031 4268 ((aai-portal-projekt, 1), (abas-portal-anwendu... 455 719 ((aa-portal, 3), (aachen-portal, 2), (aai-port... 9626 242507 portal ((portal-aber, 1), (portal-abgrund, 1), (porta... 11112 247494 5.268792 7.971472 8.306426 5.054775
2763 ((euro-a, 23), (euro-a-day, 5), (euro-ab, 1), ... 15280 207014 ((aaa-euro-länder, 1), (abo-ein-euro-jobber, 2... 3813 27667 ((ab-euro, 26), (abfindungs-euro, 1), (abmahn-... 1116 5309 euro ((euro-a, 23), (euro-a-day, 5), (euro-ab, 1), ... 20209 239990 9.182818 8.841181 6.246215 7.752058
3517 ((film-a, 3), (film-a-team, 1), (film-ab, 6), ... 6894 46067 ((aachentv-film-previewkarl, 1), (aafa-film-se... 3206 4856 ((aa-film, 6), (aaf-film, 5), (aafa-film, 3), ... 18332 176359 film ((film-a, 3), (film-a-team, 1), (film-ab, 6), ... 28432 227282 11.035423 9.866733 11.067336 10.219587
5788 ((server-a, 1), (server-aachen, 1), (server-ab... 3701 41478 ((ab-server-beratungssystem, 1), (acc-server-p... 3450 8429 ((aa-server, 5), (aaa-radius-server, 1), (aaa-... 12979 169608 server ((server-a, 1), (server-aachen, 1), (server-ab... 20130 219515 9.860805 8.562716 9.883143 8.983436
5947 ((standard-a, 41), (standard-a-anlage, 1), (st... 34316 133992 ((aa-alkalien-standard-batterie, 1), (aas-stan... 2869 3834 ((aa-standard, 2), (aaa-standard, 3), (aaad-st... 9323 72680 standard ((standard-a, 41), (standard-a-anlage, 1), (st... 46508 210506 12.485997 12.465448 11.145521 9.554024
5 ((test-a, 7), (test-a-doo, 16), (test-a-klasse... 5632 32294 ((ab-test-ansatz, 1), (abbott-hiv-test-packung... 2554 5270 ((aa-test, 3), (aa-zellen-test, 1), (aabb-test... 20479 157208 test ((test-a, 7), (test-a-doo, 16), (test-a-klasse... 28665 194772 11.066268 10.068089 9.706320 10.300668
6092 ((blog-a, 2), (blog-a-holics, 1), (blog-a-like... 5122 86186 ((abacus-nachhilfe-blog-freising, 1), (abend-b... 1803 4248 ((aa-blogs, 1), (aaa-blog, 1), (aace-blog, 1),... 14314 100106 blog ((blog-a, 2), (blog-a-holics, 1), (blog-a-like... 21239 190540 10.309378 7.410847 8.329346 10.742056
6443 ((chef-abenteurer, 1), (chef-abgang, 1), (chef... 3241 16607 ((aamc-chef-advocat, 1), (abc-chef-regie, 1), ... 550 713 ((aa-chef, 17), (aaa-chef, 2), (aak-chef, 1), ... 19634 172974 chef ((chef-abenteurer, 1), (chef-abgang, 1), (chef... 23425 190294 10.745503 8.723933 8.869139 10.438633
3367 ((marketing-ab, 1), (marketing-abc, 4), (marke... 4992 79240 ((abgrenzungonline-marketing-mix, 1), (abo-mar... 3887 13511 ((ab-marketing, 1), (abakus-internet-marketing... 3159 90646 marketing ((marketing-ab, 1), (marketing-abc, 4), (marke... 12038 183397 7.892290 8.333371 9.829395 4.582147
5903 ((mini-a, 22), (mini-a-bombe, 5), (mini-a-bomb... 30100 169928 ((aaa-mini-z, 1), (abc-mini-serie, 1), (abente... 2946 4973 ((abi-mini, 1), (abnehmen-mini, 1), (abracadab... 1675 4966 mini ((mini-a, 22), (mini-a-bombe, 5), (mini-a-bomb... 34721 179867 12.181625 11.930079 10.617253 9.186868
4485 ((format-a, 2), (format-abhängig, 2), (format-... 854 3075 ((absatz-format-menü, 1), (ad-format-anbieter,... 490 848 ((aa-format, 49), (aaa-format, 12), (aaaa-form... 12197 171744 format ((format-a, 2), (format-abhängig, 2), (format-... 13541 175667 8.341115 8.208178 8.291751 8.168670
385 ((original-a, 12), (original-a-bajonett-objekt... 23961 161860 ((acronis-original-plugin, 2), (adidas-origina... 643 849 ((ab-original, 3), (abba-original, 3), (abel-f... 1931 4718 original ((original-a, 12), (original-a-bajonett-objekt... 26535 167427 10.258179 10.051894 8.947118 9.376325
8286 ((high-a, 2), (high-abgänger, 1), (high-abo, 2... 18739 158611 ((abhoer-high-tech, 1), (ac-high-flow-sensor, ... 1762 4884 ((aaa-high, 1), (abnehmen-high, 1), (ace-high,... 614 1992 high ((high-a, 2), (high-abgänger, 1), (high-abo, 2... 21115 165487 8.480997 8.243905 6.908082 7.502975
2096 ((club-a, 5), (club-a-go-go, 1), (club-ab, 1),... 4595 33312 ((aac-club-logo, 1), (abc-club-stiftung, 1), (... 3588 6492 ((aa-club, 5), (aaa-club, 3), (aachener-boots-... 12440 124082 club ((club-a, 5), (club-a-go-go, 1), (club-ab, 1),... 20623 163886 10.658397 9.345808 10.875745 9.737219
7244 ((window-abschnitt, 1), (window-adapter, 1), (... 10927 151610 ((ach-ich-bin-ein-tolle-programmierer-denn-ich... 1413 3037 ((aa-window, 1), (ablauf-window, 1), (about-wi... 990 6899 window ((window-abschnitt, 1), (window-adapter, 1), (... 13330 161546 8.817984 8.562264 9.458592 5.069866
1444 ((management-abbrecher, 1), (management-abgäng... 2642 36010 ((aal-management-plan, 2), (aal-management-plä... 6921 38145 ((aa-management, 1), (aaa-management, 2), (aal... 8606 82777 management ((management-abbrecher, 1), (management-abgäng... 18169 156932 10.296542 8.215035 7.759268 9.584250
6950 ((dollar-aber, 1), (dollar-abfindung, 2), (dol... 2239 13059 ((acht-dollar-die-stunde-hilfskräfte, 1), (ach... 2697 7109 ((aber-miliarden-dollar, 1), (abonnenten-dolla... 799 136558 dollar ((dollar-aber, 1), (dollar-abfindung, 2), (dol... 5735 156726 2.269668 8.255339 9.993383 0.521292
2737 ((partner-abend, 2), (partner-aber, 2), (partn... 2957 21889 ((abacus-partner-meeting, 1), (abo-partner-clu... 1097 2089 ((aaa-partner, 1), (aachen-partner, 1), (aag-p... 10483 131853 partner ((partner-abend, 2), (partner-aber, 2), (partn... 14537 155831 8.072252 8.136769 8.850671 7.239502
437 ((mail-a, 6), (mail-a-scientist, 1), (mail-aan... 3217 52856 ((ab-und-zu-mail-leser, 1), (abc-mails-zukunft... 3373 13187 ((ab-mahn-e-mail, 2), (ab-mail, 1), (abacho-ma... 5214 87312 mail ((mail-a, 6), (mail-a-scientist, 1), (mail-aan... 11804 153355 7.569355 6.569940 8.126042 5.812767
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
7398 ((tick-abstand, 1), (tick-adjust-wert, 1), (ti... 235 1191 ((ad-tick-fighter, 1), (art-tick-akteur, 1), (... 61 114 ((abenteuer-tick, 1), (achsen-tick, 2), (acous... 838 1560 tick ((tick-abstand, 1), (tick-adjust-wert, 1), (ti... 1134 2865 8.275945 5.004218 4.984844 8.830567
2150 ((eagle-abgasanlage, 1), (eagle-ableger, 1), (... 745 2090 ((afn-eagle-hörer, 1), (agnico-eagle-mines, 2)... 77 104 ((afn-eagle, 2), (agent-eagle, 2), (aggressor-... 181 666 eagle ((eagle-abgasanlage, 1), (eagle-ableger, 1), (... 1003 2860 8.643641 8.343318 5.996966 5.730284
5261 ((local-absender, 1), (local-act, 1), (local-a... 793 2085 ((act-local-system, 1), (afghan-local-police, ... 149 238 ((all-local, 4), (allgäu-local, 1), (almost-ga... 239 529 local ((local-absender, 1), (local-act, 1), (local-a... 1181 2852 9.070886 8.385968 6.648964 7.036218
7138 ((english-a, 2), (english-abo, 2), (english-ad... 565 1390 ((all-english-breakfast, 1), (american-english... 179 282 ((abci-english, 1), (action-english, 1), (afri... 263 1116 english ((english-a, 2), (english-abo, 2), (english-ad... 1007 2788 8.283730 7.996415 6.893213 5.586125
3022 ((water-absolvent, 1), (water-action, 3), (wat... 645 1260 ((above-water-level, 1), (absolute-only-water-... 421 791 ((aacher-power-water, 1), (abc-water, 2), (agn... 204 723 water ((water-absolvent, 1), (water-action, 3), (wat... 1270 2774 9.379208 8.761515 7.976111 6.086188
8135 ((guitar-abonnent, 1), (guitar-achievement, 1)... 471 1306 ((acoustic-guitar-cover-pop, 2), (acoustic-gui... 267 330 ((ac-guitar, 2), (ac-guitars, 1), (acc-guitar,... 329 1120 guitar ((guitar-abonnent, 1), (guitar-achievement, 1)... 1067 2756 8.409664 7.249696 7.874747 6.462243
2577 ((supporter-account, 1), (supporter-accountsta... 365 957 ((alf-supporters-group, 1), (arminia-supporter... 69 101 ((aachen-supporter, 2), (aarau-supporter, 1), ... 685 1660 supporter ((supporter-account, 1), (supporter-accountsta... 1119 2718 8.906687 7.355464 5.791205 8.121965
3986 ((camera-a, 2), (camera-acting, 36), (camera-a... 443 1151 ((action-camera-szene, 2), (adobe-camera-raw, ... 141 287 ((action-camera, 1), (action-sports-camera, 1)... 454 1271 camera ((camera-a, 2), (camera-acting, 36), (camera-a... 1038 2709 8.846763 7.611801 6.106716 7.642461
6436 ((bell-a-mir, 3), (bell-a-mobil, 1), (bell-abl... 678 1710 ((alexander-bell-straße, 6), (alexander-graham... 178 346 ((adamek-bell, 2), (agogo-bell, 2), (agusta-be... 197 640 bell ((bell-a-mir, 3), (bell-a-mobil, 1), (bell-abl... 1053 2696 8.938646 8.393497 6.470744 6.298479
4779 ((slang-abend, 1), (slang-abkürzung, 2), (slan... 131 537 ((american-slang-übersetzer, 1), (arizona-slan... 45 52 ((abnahme-slang, 1), (aboriginies-slang, 1), (... 874 2091 slang ((slang-abend, 1), (slang-abkürzung, 2), (slan... 1050 2680 8.750162 5.270435 5.402175 8.631970
4373 ((variant-ableger, 1), (variant-abo-umschlag, ... 162 694 ((abessinier-variant-baby, 1), (abessinier-var... 41 54 ((abessinier-variant, 6), (abo-variant, 1), (a... 1057 1916 variant ((variant-ableger, 1), (variant-abo-umschlag, ... 1260 2664 8.835874 5.113765 5.208410 8.949384
3682 ((ambition-a, 2), (ambition-ausführung, 2), (a... 47 159 ((blonde-ambition-show, 1), (blonde-ambition-t... 5 5 ((abenteuer-ambition, 1), (abenteurer-ambition... 1076 2500 ambition ((ambition-a, 2), (ambition-ausführung, 2), (a... 1128 2664 9.204795 4.439136 2.321928 9.153247
1758 ((summer-academie, 1), (summer-academy, 3), (s... 599 1664 ((acw-summer-event, 1), (after-summer-party, 9... 255 385 ((action-summer, 1), (adams-summer, 4), (adobe... 190 561 summer ((summer-academie, 1), (summer-academy, 3), (s... 1044 2610 8.660493 7.707008 7.604803 6.174094
7457 ((valley-abhang, 1), (valley-ableger, 1), (val... 271 405 ((arun-valley-wildlife-expedition, 1), (baross... 357 692 ((action-valley, 1), (adaptronik-valley, 1), (... 451 1484 valley ((valley-abhang, 1), (valley-ableger, 1), (val... 1079 2581 8.736061 7.811883 7.656110 7.078671
6370 ((hunger-abschaffen, 1), (hunger-afrika, 1), (... 546 1352 ((anti-hunger-betroffenheitswear, 1), (anti-hu... 112 268 ((abend-hunger, 1), (abenteuer-hunger, 2), (ab... 432 941 hunger ((hunger-abschaffen, 1), (hunger-afrika, 1), (... 1090 2561 9.017725 8.079859 5.326666 7.720429
2222 ((asphalt-abenteuerspielplatz, 2), (asphalt-ab... 837 2148 ((adfc-asphalt-autobahn, 2), (ama-asphalt-misc... 59 71 ((afrika-asphalt, 1), (akne-asphalt, 1), (alt-... 166 334 asphalt ((asphalt-abenteuerspielplatz, 2), (asphalt-ab... 1062 2553 8.891617 8.451024 5.811719 6.744294
36 ((geek-a-cycle, 1), (geek-abenteuer, 1), (geek... 585 1488 ((access-geeks-blog, 1), (alpha-geek-leben, 1)... 65 83 ((abenteuer-geeks, 1), (action-geek, 1), (adob... 383 972 geek ((geek-a-cycle, 1), (geek-abenteuer, 1), (geek... 1033 2543 8.858112 7.955541 5.879625 7.501461
6070 ((depression-aehnlichem, 1), (depression-aktiv... 341 885 ((alter-depression-schmerz, 1), (an-depression... 99 216 ((abortion-depression, 1), (advents-depression... 571 1417 depression ((depression-aehnlichem, 1), (depression-aktiv... 1011 2518 8.705121 7.382613 5.314685 7.736219
3753 ((maximum-a-posteriori-adaption, 1), (maximum-... 479 1214 ((absolute-maximum-rating, 1), (adoleszenz-max... 66 101 ((absorptions-maximum, 1), (abstimm-maximum, 2... 533 1165 maximum ((maximum-a-posteriori-adaption, 1), (maximum-... 1078 2480 8.868034 7.687014 5.471775 7.828888
1676 ((terrain-abdeckung, 1), (terrain-abhängige, 1... 283 801 ((all-terrain-armored-transporter, 1), (all-te... 148 435 ((action-terrain, 1), (adventure-terrain, 1), ... 638 1197 terrain ((terrain-abdeckung, 1), (terrain-abhängige, 1... 1069 2433 8.822450 6.866251 5.972357 8.168750
3529 ((spinner-abteilung, 3), (spinner-account, 1),... 265 541 ((all-turn-it-spinner-ersatz, 1), (alu-spinner... 45 51 ((aalround-spinner, 9), (aaron-spinner, 1), (a... 705 1698 spinner ((spinner-abteilung, 3), (spinner-account, 1),... 1015 2290 8.918361 7.169125 5.437131 8.320645
6135 ((loser-abschied, 1), (loser-affäre, 1), (lose... 501 1247 ((absoluten-online-loser-vollhorst-preis, 3), ... 75 160 ((abkacker-loser, 1), (abs-loser, 1), (afc-los... 452 783 loser ((loser-abschied, 1), (loser-affäre, 1), (lose... 1028 2190 9.162143 7.968248 5.507344 8.261235
1915 ((argumentation-als-narration, 2), (argumentat... 243 427 ((die-argumentation-des-feindes-kennenlernen, ... 13 17 ((aber-argumentation, 1), (abitur-argumentatio... 1038 1721 argumentation ((argumentation-als-narration, 2), (argumentat... 1294 2165 9.703910 7.413752 3.499228 9.352129
3072 ((expertise-angebot, 1), (expertise-ansatz, 1)... 108 216 ((bh-expertise-template, 1), (re-expertise-kos... 7 14 ((abenteurer-expertise, 1), (abgabenordnung-ex... 887 1911 expertise ((expertise-angebot, 1), (expertise-ansatz, 1)... 1002 2141 9.026928 6.086210 2.495603 8.816009
6235 ((lake-agnes-trail, 1), (lake-aktion, 1), (lak... 342 703 ((abijata-shala-lakes-nationalpark, 1), (andhr... 183 411 ((abaya-lake, 1), (acid-lake, 1), (acres-lake,... 483 1023 lake ((lake-agnes-trail, 1), (lake-aktion, 1), (lak... 1008 2137 8.854502 7.456583 6.071877 7.812522
363 ((overkill-abend, 1), (overkill-action, 1), (o... 203 410 ((action-effekt-overkill-double-feature, 1), (... 47 58 ((abhör-overkill, 2), (abschreckungs-overkill,... 845 1602 overkill ((overkill-abend, 1), (overkill-action, 1), (o... 1095 2070 9.203837 6.959941 5.452640 8.759612
522 ((technology-absolvent, 1), (technology-accapt... 235 532 ((advanced-flow-technology-software, 1), (adva... 195 329 ((abc-technology, 2), (abk-technology, 6), (ac... 629 1206 technology ((technology-absolvent, 1), (technology-accapt... 1059 2067 9.443547 7.171150 7.075364 8.727653
3383 ((program-alerts, 1), (program-analyser-anlage... 240 556 ((albatros-program-men, 1), (annual-program-fu... 53 60 ((aaa-program, 1), (aamc-program, 1), (abap-pr... 781 1206 program ((program-alerts, 1), (program-analyser-anlage... 1074 1822 9.504958 7.019761 5.648394 9.212835
2176 ((pose-aber, 1), (pose-and-mark-modul, 1), (po... 56 109 ((auf-die-pose-werfer, 1), (ich-putz-mich-währ... 10 12 ((abschluss-pose, 3), (abschluß-pose, 1), (abw... 960 1584 pose ((pose-aber, 1), (pose-and-mark-modul, 1), (po... 1026 1705 9.370497 5.244871 3.251629 9.267372
2 ((imitation-assimilation-transformation, 1), (... 48 61 ((albert-koch-imitation-post, 1), (anti-imitat... 24 31 ((abba-imitation, 2), (acad-imitation, 1), (ac... 977 1365 imitation ((imitation-assimilation-transformation, 1), (... 1049 1457 9.694730 5.408208 4.478232 9.579924

1387 rows × 17 columns
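
The entropy values in the table above can be reproduced in outline as follows. This is a minimal sketch, not the notebook's actual code: it assumes that each tuple pairs a hyphenated compound type with its token count, that the lists are grouped by the position of the base (left, middle, right), and that entropy is Shannon entropy in bits over those token counts. The function names `position_of` and `shannon_entropy` are hypothetical, and the sample uses only four of the visible `asphalt` entries, so the results do not match the full-corpus values.

```python
import math
from collections import defaultdict

def position_of(base, compound):
    """Return 'left', 'middle' or 'right' for the first constituent equal to base, else None."""
    parts = compound.split("-")
    if base not in parts:
        return None
    i = parts.index(base)
    if i == 0:
        return "left"
    if i == len(parts) - 1:
        return "right"
    return "middle"

def shannon_entropy(counts):
    """Shannon entropy in bits of a list of token counts (one count per type)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Toy sample: a few (compound, token count) pairs visible in the table above.
sample = [
    ("asphalt-abenteuerspielplatz", 2),
    ("adfc-asphalt-autobahn", 2),
    ("afrika-asphalt", 1),
    ("akne-asphalt", 1),
]

# Group token counts by the position of the base within each compound.
by_position = defaultdict(list)
for compound, count in sample:
    pos = position_of("asphalt", compound)
    if pos is not None:
        by_position[pos].append(count)

entropies = {pos: shannon_entropy(c) for pos, c in by_position.items()}
total_entropy = shannon_entropy([count for _, count in sample])

print(entropies)
print(total_entropy)
```

Exact matching of constituents is a simplification: the table appears to count forms such as geeks under the base geek, which this sketch would miss.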

Constituent position differences

  • Which anglicisms show the greatest difference in $H_R$ and $H_L$ values?

  • Greater left-hand constituency: more use as modifiers, class conversion from English heads (nouns) to German modifying adjuncts (adjectives)?
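
One way to approach the first question is to rank bases by the difference between left- and right-position entropy. The sketch below is an illustration under assumptions: it reads the last three values of each table row above as $H_L$, $H_M$ and $H_R$ (the column headers are not visible here), and it uses only five of the displayed rows. The names `rows` and `position_gap` are hypothetical.

```python
# (base, H_L, H_R) copied from rows of the table above.
# Reading these columns as left/right entropy is an assumption.
rows = [
    ("asphalt", 8.451024, 6.744294),
    ("geek", 7.955541, 7.501461),
    ("argumentation", 7.413752, 9.352129),
    ("expertise", 6.086210, 8.816009),
    ("pose", 5.244871, 9.267372),
]

def position_gap(h_left, h_right):
    """Positive: more varied use as left-hand constituent (modifier position)."""
    return h_left - h_right

# Sort from the strongest left-hand preference to the strongest right-hand preference.
ranked = sorted(rows, key=lambda r: position_gap(r[1], r[2]), reverse=True)

for base, h_left, h_right in ranked:
    print(f"{base:15s} H_L - H_R = {position_gap(h_left, h_right):+.3f}")
```

In this small selection, asphalt and geek lean toward the left-hand (modifier) position, while argumentation, expertise and pose lean toward the right-hand (head) position.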

Summarizing total entropy

  • Slightly lower overall entropy in the DECOW16bx corpus

  • Preponderance of traditional online text genres in this corpus = less creativity in hyphenated compound formation?


Caveats

  • Better preprocessing necessary (cf. Evert & Lüdeling, 2001; Lüdeling, Evert & Heid, 2000)

    • More carefully distinguish indigenous and exogenous bases
    • Consider inflected forms and non-hyphenated compounds
    • Take frequencies of isolated base forms into account
  • Proper nouns and verbal phrases as distinct categories

  • Take relative frequencies and entropies in English text into account

Summary and outlook

  • Hyphenation in German compounds exhibits variation

  • The productivity of constituent bases in hyphenated compounds can be evaluated using Shannon entropy

  • The most productive anglicism bases may be those whose meanings are well understood by German users → inclusion in the German lexicon (cf. Hein & Engelberg, 2017: more common indigenous bases are more productive)

  • Many of these types denote entities in the domains of internet or information technology: Bedürfnislehnwörter 'necessary borrowings' (Carstensen, 1965; Onysko, 2007; Onysko & Winter-Froemel, 2011; Winter-Froemel, Onysko & Calude, 2014)

  • Genre effects may be evident when considering corpora from Twitter, WordPress blogs, and the web

  • Internal capitalization (ReiseCenter, YouTube, etc.)

  • Borrowability index?
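
For the internal-capitalization item in the outlook, candidate forms could be picked out with a simple pattern before any entropy calculation. This is a rough sketch, not part of the study: the function names are hypothetical, and the pattern (a lowercase letter directly followed by an uppercase letter) is an assumption that will also match brand names and typos.

```python
import re

# A lowercase letter immediately followed by an uppercase letter.
INTERNAL_CAP = re.compile(r"[a-zäöüß][A-ZÄÖÜ]")
# A capitalized segment, or a run of lowercase letters at the start of a token.
SEGMENT = re.compile(r"[A-ZÄÖÜ][a-zäöüß]*|[a-zäöüß]+")

def has_internal_capital(token):
    """True if the token contains a capital letter after a lowercase letter."""
    return INTERNAL_CAP.search(token) is not None

def split_on_capitals(token):
    """Split a token into segments that each start at a capital letter."""
    return SEGMENT.findall(token)

examples = ["ReiseCenter", "YouTube", "Freizeitangebot"]
candidates = {t: split_on_capitals(t) for t in examples if has_internal_capital(t)}

print(candidates)
```

The resulting segments could then be treated like the constituents of hyphenated compounds, although whether that is appropriate is an open question.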

That's all! Thanks for listening!

References


Baayen, R. H. (1993). On frequency, transparency, and productivity. In: Yearbook of morphology 1991. Ed. by G. E. Booij and J. van Marle. Dordrecht: Kluwer, pp. 109–149.

Baayen, R. H. (1994a). Derivational productivity and text typology. Journal of Quantitative Linguistics 1, 16–34.

Baayen, R. H. (1994b). Productivity in language production. Language and Cognitive Processes 9, 447–469.

Baayen, R. H. (2001). Word frequency distributions. Dordrecht: Kluwer.

Baayen, R. H. (2003). Probabilistic approaches to morphology. In: Probability theory in linguistics. Ed. by R. Bod, J. B. Hay, and S. Jannedy. Cambridge, MA: MIT Press, pp. 229–287.

Baayen, R. H. and J. Hay (2002). Affix productivity and base productivity. Paper presented at ESSE 6, Strasbourg.

Baayen, R. H., R. Lieber, and R. Schreuder (1997). The morphological complexity of simplex nouns. Linguistics 35, 861–877.

Baayen, R. H., L. H. Wurm, and J. Aycock (2007). Lexical dynamics for low-frequency complex words: A regression study across tasks and modalities. The Mental Lexicon 2.3, 419–463.

Barbaresi, A. (2016). Efficient construction of metadata-enhanced web corpora. In: Proceedings of the 10th Web as Corpus Workshop, pp. 7–16.

Burmasova, S. (2010). Empirische Untersuchung der Anglizismen im Deutschen am Material der Zeitung 'Die Welt'. Bamberg: University of Bamberg Press.

Carstensen, B. (1965). Englische Einflüsse auf die Deutsche Sprache nach 1945. Heidelberg: Carl Winter Verlag.

Coats, S. (2018). Variation of new German verbal Anglicisms in a social media corpus. In: Proceedings of the 6th Conference on CMC and Social Media Corpora for the Humanities, pp. 27–32.

Duden (2006). Die Deutsche Rechtschreibung (24th ed.) Mannheim: Dudenverlag.

Eisenberg, P. (2011). Das Fremdwort im Deutschen. Berlin and New York: de Gruyter Mouton.

Eisenberg, P. (2013). Anglizismen im Deutschen. In: Reichtum und Armut der deutschen Sprache: Erster Bericht zur Lage der deutschen Sprache. Ed. by Deutsche Akademie für Sprache und Dichtung, Union der deutschen Akademien der Wissenschaften. Berlin: de Gruyter, pp. 57–119.

Evert, S. and A. Lüdeling (2001). Measuring morphological productivity: Is automatic preprocessing sufficient? In: Proceedings of the Corpus Linguistics 2001 Conference.

Fleischer, W. and I. Barz (2012). Wortbildung der deutschen Gegenwartssprache (4th ed.) Berlin: de Gruyter.

Hay, J. B. (2001). Lexical frequency in morphology: Is everything relative? Linguistics 39, 1041–1070.

Hay, J. B. and R. H. Baayen (2002). Phonotactics, parsing and productivity. Rivista di Linguistica 15.1, 99–130.

Hein, K. and S. Engelberg (2017). Morphological variation: the case of productivity in German compound formation. Mediterranean Morphology Meetings 11, 36–50.

Lüdeling, A., S. Evert, and U. Heid (2000). On measuring morphological productivity. In: Proceedings of KONVENS 2000, pp. 57–61.

Lüdeling, A. and N. H. de Jong (2002). German particle verbs and word-formation. In: Verb-particle explorations. Ed. by N. Dehé et al. Berlin: Mouton de Gruyter, pp. 315–334.

Onysko, A. (2007). Anglicisms in German: Borrowing, lexical productivity, and written codeswitching. Berlin: de Gruyter.

Onysko, A. and E. Winter-Froemel (2011). Necessary loans – luxury loans? Exploring the pragmatic dimension of borrowing. Journal of Pragmatics 43.6, 1550–1567.

Schäfer, R. (2015). Processing and querying large web corpora with the COW14 architecture. In: Proceedings of Challenges in the Management of Large Corpora (CMLC-3), pp. 28–34.

Schäfer, R. and F. Bildhauer (2012). Building large corpora from the web using a new efficient tool chain. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 12), pp. 486–493.

Schultink, H. (1961). Produktiviteit als morfologisch fenomeen. Forum der Letteren, 110–125.

Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal 27, 379–423; 623–656.

Winter-Froemel, E., A. Onysko, and A. Calude (2014). Why some non-catachrestic borrowings are more successful than others: a case study of English loans in German. In: Language contact around the globe. Ed. by A. Koll-Stobbe and S. Knospe. Frankfurt am Main: Peter Lang, pp. 119–142.