Corpus approaches to language in social media [1 ed.]
9781000915594, 9781000915556, 9781032125701
This book showcases the unique possibilities of corpus linguistic methodologies in engaging with and analysing language
English
Pages 398 [399]
Year 2023
Table of contents:
Cover
Half Title
Series
Title
Copyright
Dedication
Contents
Preface/Acknowledgements
1 Introduction
Setting the stage
Interconnecting with the digital
Digital humanities as practices of interconnections
More than numbers: studying cognition and society through corpus approaches
Scope and structure of this book
About the companion website
References
2 Social media as digital research data
The impact of the digital on cognition and society
Open source
Copyright and ethics
Copyright issues
Ethical issues
The characteristics of a corpus
More than text: corpus metadata, textual markup, and annotation
Metadata
Evaluating metadata
Textual markup
Annotations
References
3 Fundamentals of corpus linguistics
Corpus tools
The building blocks of corpus linguistics
Type, token, lemma
Frequencies and frequency lists
Dispersion
Concordances and key-word-in-context
Collocations
Keywords
Stoplist
Advancements in corpus linguistics
A corpus approach perspective on sentiment analysis and topic modelling
References
4 Imagining the data: corpus design
Setting up the working environment
Command-line interface and virtual programming environments
A note about programming languages
CSV, XML and HTML, JSON
CSV
XML and HTML
JSON
Preserving the data
Internet Archive and the Wayback Machine
WARC format
git
Working with digital textual data
Unicode, UTF-8, character encodings
Regular expressions
Towards data collection
References
5 Creating the data: corpus collection
Collecting the data: general remarks
Crawling and scraping web data
APIs
General purpose scrapers
#LancsBox
Archivebox
Trafilatura
The coding way: BeautifulSoup
Platform-specific scrapers
Twitter
Instagram
Facebook
YouTube
Data processing
Dates, time, and Unix time
Text normalisation
PDF, Word, images
Detecting the language(s) used in a text
Emoticons and emojis
Hashtags
Other elements
Annotations
Verticalised format
Exploring the collected data
Cleaning and formatting the data
References
6 Case studies
Analysing crypto-drug market fora
Background
Context
Corpus design
Data processing
Corpus analysis
Analysing the language of far-right groups on Twitter and Facebook
Background
Context
Corpus design
Data processing
Corpus analysis
The communicative modus operandi of online child sexual groomers
Background
Context
Corpus design
Data processing
Corpus analysis
References
7 Conclusion
A broad view of corpus approaches
References
Appendix
Index