Explanations By Boundary Exploration for Textual Data


Choose a neural network (NN) and a dataset size

GPT-2
Transformer NN with attention and no reconstruction task
This option uses the Hugging Face (huggingface.co) implementation of GPT-2 for sequence classification. Texts are embedded in a 1,024-dimensional space.
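As a rough illustration only (not necessarily this demo's exact pipeline), the snippet below extracts such an embedding with the Hugging Face transformers library; the model name gpt2-medium (whose hidden size is 1,024) and the last-token pooling are assumptions.

    # Hypothetical sketch: extracting a 1,024-dimensional GPT-2 embedding.
    # gpt2-medium (hidden size 1,024) and last-token pooling are assumptions.
    import torch
    from transformers import GPT2TokenizerFast, GPT2Model

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-medium")
    model = GPT2Model.from_pretrained("gpt2-medium")
    model.eval()

    inputs = tokenizer("An example review to classify.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Hidden state of the last token, mirroring how
    # GPT2ForSequenceClassification pools a sequence.
    embedding = outputs.last_hidden_state[0, -1]  # shape: (1024,)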

SSA
Recurrent NN (RNN) with attention and no reconstruction task

This implementation uses a structured self-attentive sentence embedding classifier, implemented here. Texts are embedded in a 50-dimensional space.
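A minimal PyTorch sketch of a structured self-attentive sentence embedding in the spirit of this classifier is given below; every layer size, including the final 50-dimensional projection, is an illustrative assumption rather than the actual configuration.

    # Sketch of a structured self-attentive sentence embedding
    # (Lin et al., 2017). All sizes are illustrative assumptions.
    import torch
    import torch.nn as nn

    class SelfAttentiveEmbedding(nn.Module):
        def __init__(self, vocab_size, emb_dim=100, hidden=150,
                     att_dim=64, hops=4, out_dim=50):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True,
                                batch_first=True)
            self.w1 = nn.Linear(2 * hidden, att_dim, bias=False)
            self.w2 = nn.Linear(att_dim, hops, bias=False)
            self.proj = nn.Linear(hops * 2 * hidden, out_dim)

        def forward(self, tokens):                  # tokens: (B, T)
            h, _ = self.lstm(self.embed(tokens))    # (B, T, 2*hidden)
            # A = softmax(W2 tanh(W1 H)): one attention distribution per hop
            a = torch.softmax(self.w2(torch.tanh(self.w1(h))), dim=1)
            m = torch.einsum("bth,btd->bhd", a, h)  # (B, hops, 2*hidden)
            return self.proj(m.flatten(1))          # (B, 50) embedding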

Reconstruction RNN
RNN without attention and with a reconstruction task

This implementation uses a simple recurrent neural network, presented here. Texts are embedded in a 1,024-dimensional space and truncated after the 20th word. For this network and dataset, UMAP is used with the number of neighbors set to three instead of two.
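A sketch of this preprocessing and projection step is shown below; the corpus and the random embeddings are hypothetical placeholders for the RNN's output, while umap.UMAP(n_neighbors=3) is the actual umap-learn API.

    # Sketch: truncate texts at the 20th word, then project the
    # embeddings with UMAP using three neighbors instead of two.
    import numpy as np
    import umap

    def truncate(text, max_words=20):
        # Keep only the first 20 words, as described above.
        return " ".join(text.split()[:max_words])

    raw_texts = [f"placeholder review number {i} ..." for i in range(100)]
    texts = [truncate(t) for t in raw_texts]

    # Hypothetical stand-in for the RNN's 1,024-dimensional embeddings.
    embeddings = np.random.rand(len(texts), 1024)

    reducer = umap.UMAP(n_neighbors=3, n_components=2)
    coords = reducer.fit_transform(embeddings)  # 2-D coordinates for the view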

EBBE-Text is presented here with three different NNs. However, it can be used with any classifier that creates an embedding for each input.
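As a hypothetical sketch, the interface such a classifier would need to expose could look as follows; the names and signatures are illustrative only.

    # Hypothetical interface: EBBE-Text only needs, for every input
    # text, an embedding and a class prediction from the classifier.
    from typing import Protocol
    import numpy as np

    class EmbeddingClassifier(Protocol):
        def embed(self, text: str) -> np.ndarray:
            """Return the model's embedding of the input text."""
            ...

        def predict(self, text: str) -> int:
            """Return the predicted class label for the input text."""
            ...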