# Text-Attack example

This script demonstrates a simple example of using Text-Attack with TensorFlow v2.x. The example trains a small model on the IMDB
dataset and then uses Text-Attack to create adversarial examples. It is also possible to pass a pretrained model to Text-Attack instead.
The parameters are chosen to reduce the computational requirements of the script and are not optimised for accuracy.

* reference: https://textattack.readthedocs.io/en/master/

### Text Classification

* Date: 10/24/2024
* Author: Pawan Kumar
* Type of attack: Text-attack

### Metadata
* Dataset: IMDB
* Size of training set: 25,000
* Size of testing set: 25,000
* Number of classes: 2
* Original model: LSTM trained in this script

In [None]:
"""
Description: Uncomment and run to install libraries. Needed for running first time only.
"""
# !pip install textattack[tensorflow]


In [None]:
"""
Description: Import the required libraries.
"""
# Importing necessary libraries
import os
import numpy as np

import tensorflow as tf
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Embedding, Dense, Dropout, SimpleRNN
from tensorflow.keras.datasets import imdb

from transformers import TFAutoModelForSequenceClassification, AutoTokenizer

from textattack.models.wrappers import ModelWrapper
from textattack.datasets import HuggingFaceDataset
from textattack.attack_recipes import PWWSRen2019
from textattack import Attacker
import textattack

In [None]:
"""
Description: Flag that selects between training a model and loading one from Hugging Face.
"""

# Flag to determine whether to train a new model or use a pre-trained one
model_train = True # False-> download from Huggingface

# Step 1: Load the IMDB dataset

In [None]:
"""
Description: Load the IMDB data with the Keras dataset loader.
"""
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=10000)
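
`imdb.load_data` returns each review as a list of integer word ids rather than text. With the Keras defaults, ids 0, 1 and 2 are reserved for padding, start-of-sequence and out-of-vocabulary markers, so the frequency ranks in the raw word index are shifted by 3. The sketch below illustrates that mapping on a tiny made-up word index. `toy_word_index`, `build_id_to_word` and `decode` are hypothetical names used only for illustration; the real index comes from `imdb.get_word_index()`.

```python
# Toy stand-in for imdb.get_word_index(): word -> frequency rank (1 = most frequent).
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

INDEX_FROM = 3  # Keras default offset applied by imdb.load_data
RESERVED = {0: "<PAD>", 1: "<START>", 2: "<UNK>"}


def build_id_to_word(word_index, index_from=INDEX_FROM):
    """Map shifted integer ids back to words, including the reserved markers."""
    id_to_word = {rank + index_from: word for word, rank in word_index.items()}
    id_to_word.update(RESERVED)
    return id_to_word


def decode(ids, id_to_word):
    """Turn a list of integer ids back into a readable string."""
    return " ".join(id_to_word.get(i, "<UNK>") for i in ids)


id_to_word = build_id_to_word(toy_word_index)
# 1 = start marker, then "the movie was great" shifted by 3, then two padding ids.
encoded_review = [1, 4, 5, 6, 7, 0, 0]
print(decode(encoded_review, id_to_word))
# prints: <START> the movie was great <PAD> <PAD>
```

The same shift is why the model cell adds 3 to every rank before handing the word index to the tokenizer.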

# Step 2: Create the model

In [None]:
"""
Description: Training or loading the model.
"""

if model_train:
    # Setting up parameters for the IMDB dataset and model
    vocab_size = 10000  # Number of words to keep in the vocabulary
    max_length = 100    # Maximum length of each sequence
    embedding_dim = 16  # Embedding dimensions
    oov_tok = "<OOV>"   # Out of vocabulary token

    # Loading the IMDB dataset
    (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size)

    # Padding sequences to ensure uniform length
    x_train = pad_sequences(x_train, maxlen=max_length, padding='post', truncating='post')
    x_test = pad_sequences(x_test, maxlen=max_length, padding='post', truncating='post')

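    # Creating word index for vocabulary. Ranks are shifted by 3 to make room for the
    # reserved ids, and only shifted ids below vocab_size are kept (as imdb.load_data does).
    word_index = imdb.get_word_index()
    word_index = {k: (v + 3) for k, v in word_index.items() if v + 3 < vocab_size}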
    word_index["<PAD>"] = 0
    word_index["<START>"] = 1
    word_index["<UNK>"] = 2
    word_index["<UNUSED>"] = 3

    # Create an inverse word index to decode integer sequences back to words (if needed)
    inverse_word_index = {v: k for k, v in word_index.items()}

    # creating the tokenizer
    tokenizer = Tokenizer(num_words=vocab_size)
    tokenizer.word_index = word_index

    # Defining the model architecture
    model = Sequential([
        Embedding(vocab_size, embedding_dim, input_length=max_length),
        LSTM(32),
        Dense(1, activation='sigmoid')
    ])

    # Compiling and training the model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=30, validation_data=(x_test, y_test))

else:
    # Using a pre-trained model from Hugging Face
    model_name = "finiteautomata/bertweet-base-sentiment-analysis"

    # Load the model
    model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

    # Load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_name)

Epoch 1/30
782/782 ━━━━━━━━━━━━━━━━━━━━ 17s 20ms/step - accuracy: 0.6455 - loss: 0.6067 - val_accuracy: 0.7898 - val_loss: 0.4815
Epoch 2/30
782/782 ━━━━━━━━━━━━━━━━━━━━ 15s 19ms/step - accuracy: 0.8382 - loss: 0.3954 - val_accuracy: 0.8062 - val_loss: 0.4408
Epoch 3/30
782/782 ━━━━━━━━━━━━━━━━━━━━ 15s 20ms/step - accuracy: 0.8786 - loss: 0.3209 - val_accuracy: 0.8006 - val_loss: 0.4716
Epoch 4/30
782/782 ━━━━━━━━━━━━━━━━━━━━ 15s 20ms/step - accuracy: 0.8982 - loss: 0.2883 - val_accuracy: 0.8050 - val_loss: 0.4872
Epoch 5/30
782/782 ━━━━━━━━━━━━━━━━━━━━ 15s 20ms/step - accuracy: 0.9219 - loss: 0.2219 - val_accuracy: 0.7987 - val_loss: 0.4955
Epoch 6/30
782/782 ━━━━━━━━━━━━━━━━━━━━ 15s 20ms/step - accuracy: 0.9352 - loss: 0.1917 - val_accuracy: 0.7968 - val_loss: 0.5100
Epoch 7/30
782/782

# Step 3: Create the Text-Attack classifier

In [None]:
"""
Description: Define a model wrapper class so that Text-Attack can query the model with raw text.
"""

class CustomTensorFlowModelWrapper(ModelWrapper):
    def __init__(self, model, tokenizer, model_type, max_length=None, preprocess_text=None):
        self.model = model
        self.tokenizer = tokenizer
        self.max_length = max_length
        self.preprocess_text = preprocess_text
        self.model_type = model_type

    def __call__(self, text_list):
        # Collect one row of class probabilities per input text
        all_preds = []
        for text in text_list:
            if self.model_type.lower() == "transformer":
                # Preprocessing for transformer models
                preprocessed_text = self.tokenizer.encode(text, return_tensors="tf")
                logits = self.model(preprocessed_text).logits
                # Softmax turns the logits into probabilities that sum to 1 per row
                final_preds = tf.nn.softmax(logits, axis=-1).numpy()
            else:
                # Preprocessing for other models: words -> padded integer sequence
                sequences = self.tokenizer.texts_to_sequences([text])
                preprocessed_text = pad_sequences(sequences, maxlen=self.max_length, padding='post', truncating='post')
                # The sigmoid output is P(positive) with shape (1, 1)
                prob_pos = self.model(preprocessed_text).numpy()[:, 0]
                # Expand to [P(negative), P(positive)]
                final_preds = np.stack((1 - prob_pos, prob_pos), axis=1)
            all_preds.append(final_preds)
        return np.concatenate(all_preds, axis=0)
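
Text-Attack calls the wrapper with a list of strings and expects one row of class scores per string. For the LSTM branch this means turning each text into a fixed-length id sequence and expanding the single sigmoid output into two columns, [P(negative), P(positive)]. The following is a minimal NumPy-only sketch of those two steps. `toy_vocab`, `encode_and_pad`, `fake_positive_prob` and `to_two_class_scores` are illustrative names, and the scoring rule is a made-up stand-in for the trained model. Unlike the Keras tokenizer, the sketch does not strip punctuation.

```python
import numpy as np

toy_vocab = {"<PAD>": 0, "good": 4, "bad": 5, "film": 6}  # hypothetical word -> id map


def encode_and_pad(text, vocab, max_length):
    """Lowercase, split on whitespace, drop unknown words, then post-truncate and post-pad."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab][:max_length]
    return np.array(ids + [vocab["<PAD>"]] * (max_length - len(ids)))


def fake_positive_prob(batch):
    """Stand-in for model(batch): returns P(positive) with shape (n, 1)."""
    good = (batch == toy_vocab["good"]).sum(axis=1)
    bad = (batch == toy_vocab["bad"]).sum(axis=1)
    return (1.0 / (1.0 + np.exp(-(good - bad)))).reshape(-1, 1)


def to_two_class_scores(prob_pos):
    """Expand (n, 1) sigmoid outputs into (n, 2) rows of [P(neg), P(pos)]."""
    prob_pos = prob_pos[:, 0]
    return np.stack((1.0 - prob_pos, prob_pos), axis=1)


texts = ["Good film", "bad bad film"]
batch = np.stack([encode_and_pad(t, toy_vocab, max_length=4) for t in texts])
scores = to_two_class_scores(fake_positive_prob(batch))
print(batch)         # one padded id row per text
print(scores.shape)  # (2, 2)
```

Each row of `scores` sums to 1, which is the form the untargeted classification goal function works with.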

# Step 4: Create the attack vectors on benign test examples

In [None]:
"""
Description: Generate the text attack vectors.
"""


# Wrapping the model for TextAttack
# (for the Hugging Face model, pass "transformer" as model_type; max_length is then not needed)
model_wrapper = CustomTensorFlowModelWrapper(model,tokenizer,"lstm",max_length)

# Preparing input data for the attack
input_data = [("""Don't waste your time or money on this one. This book is terrible. Whatever happened to Amanda Quick writing great books. She used to be my favorite autor. It will be a long time before I ever purchase another one of her books.""", 0),
             ("I am happy as it was a wonderful experience",1)]
dataset = textattack.datasets.Dataset(input_data)

# Setting up the attack
attack = PWWSRen2019.build(model_wrapper)

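# Launching the attack
# Note: Attacker defaults to 10 examples, so with 2 inputs it logs a warning and attacks only those 2.
# Passing attack_args=textattack.AttackArgs(num_examples=len(input_data)) avoids the warning.
attacker = Attacker(attack, dataset)
attacked_data = attacker.attack_dataset()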

[nltk_data] Error loading omw-1.4: <urlopen error [Errno 11001]
[nltk_data]     getaddrinfo failed>
textattack: Unknown if model of class <class 'keras.src.models.sequential.Sequential'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.
textattack: Attempting to attack 10 samples when only 2 are available.


Attack(
  (search_method): GreedyWordSwapWIR(
    (wir_method):  weighted-saliency
  )
  (goal_function):  UntargetedClassification
  (transformation):  WordSwapWordNet
  (constraints): 
    (0): RepeatModification
    (1): StopwordModification
  (is_black_box):  True
) 



[Succeeded / Failed / Skipped / Total] 1 / 0 / 0 / 1:  10%|██▉                          | 1/10 [00:37<05:35, 37.25s/it]

--------------------------------------------- Result 1 ---------------------------------------------

Don't [[waste]] your time or money on this one. This book is terrible. Whatever happened to Amanda Quick writing great books. She used to be my favorite autor. It will be a long time before I ever purchase another one of her books.

Don't [[desolate]] your time or money on this one. This book is terrible. Whatever happened to Amanda Quick writing great books. She used to be my favorite autor. It will be a long time before I ever purchase another one of her books.




[Succeeded / Failed / Skipped / Total] 2 / 0 / 0 / 2:  20%|█████▊                       | 2/10 [00:41<02:47, 20.93s/it]

--------------------------------------------- Result 2 ---------------------------------------------

I am happy as it was a [[wonderful]] experience

I am happy as it was a [[marvellous]] experience



+-------------------------------+--------+
| Attack Results                |        |
+-------------------------------+--------+
| Number of successful attacks: | 2      |
| Number of failed attacks:     | 0      |
| Number of skipped attacks:    | 0      |
| Original accuracy:            | 100.0% |
| Accuracy under attack:        | 0.0%   |
| Attack success rate:          | 100.0% |
| Average perturbed word %:     | 6.72%  |
| Average num. words per input: | 26.0   |
| Avg num queries:              | 166.0  |
+-------------------------------+--------+
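
The recipe printed above pairs a WordNet synonym swap with a greedy search ordered by weighted saliency. As a rough illustration of that idea (not TextAttack's implementation), the sketch below scores each word by how much the true-class probability drops when the word is masked, weights the best synonym's probability drop by a softmax over those saliencies, and applies swaps in that order until the predicted label flips. The classifier `toy_prob_positive`, the `SYNONYMS` and `WEIGHTS` tables and the `<unk>` mask token are hypothetical stand-ins.

```python
import math

# Hypothetical stand-ins: a tiny synonym table and a keyword-based "classifier".
SYNONYMS = {"wonderful": ["marvellous", "fine"], "happy": ["glad"]}
WEIGHTS = {"wonderful": 2.0, "happy": 1.0, "marvellous": 0.2, "fine": 0.5, "glad": 0.1}


def toy_prob_positive(words):
    """Toy P(positive | words): sigmoid of summed keyword weights minus a bias."""
    score = sum(WEIGHTS.get(w, 0.0) for w in words) - 1.5
    return 1.0 / (1.0 + math.exp(-score))


def pwws_attack(words, true_label=1):
    """Greedy swap ordered by softmax(saliency) * best probability drop."""
    def p_true(ws):
        p = toy_prob_positive(ws)
        return p if true_label == 1 else 1.0 - p

    base = p_true(words)
    saliency, best_swap = [], []
    for i, w in enumerate(words):
        # Word saliency: drop in true-class probability when the word is masked.
        saliency.append(base - p_true(words[:i] + ["<unk>"] + words[i + 1:]))
        # Best synonym: the candidate with the largest probability drop.
        cands = [(base - p_true(words[:i] + [s] + words[i + 1:]), s)
                 for s in SYNONYMS.get(w, [])]
        best_swap.append(max(cands) if cands else (0.0, w))

    z = sum(math.exp(s) for s in saliency)
    order = sorted(range(len(words)),
                   key=lambda i: math.exp(saliency[i]) / z * best_swap[i][0],
                   reverse=True)

    adv = list(words)
    for i in order:
        if best_swap[i][1] == words[i]:
            continue  # no synonym available for this word
        adv[i] = best_swap[i][1]
        if p_true(adv) < 0.5:  # label flipped, stop early
            return adv
    return None  # attack failed


print(pwws_attack("i am happy as it was a wonderful experience".split()))
```

With these toy weights a single swap of "wonderful" is enough to push the positive probability below 0.5, so the search stops after one substitution.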





# Step 5: Results of Text-Attack on benign test examples

In [None]:
"""
Description: Displaying result of text attack
"""
# Displaying the results of the attack
for data in attacked_data:
    print(f"Original_text -> {data.original_text()}")
    print(f"Original_text_Label -> {data.original_result.ground_truth_output}")
    print()
    print(f"Perturbed_text -> {data.perturbed_text()}")
    print(f"Perturbed_text_Label -> {data.perturbed_result.output}")
    print()
    print('-'*75)

Original_text -> Don't waste your time or money on this one. This book is terrible. Whatever happened to Amanda Quick writing great books. She used to be my favorite autor. It will be a long time before I ever purchase another one of her books.
Original_text_Label -> 0

Perturbed_text -> Don't desolate your time or money on this one. This book is terrible. Whatever happened to Amanda Quick writing great books. She used to be my favorite autor. It will be a long time before I ever purchase another one of her books.
Perturbed_text_Label -> 1

---------------------------------------------------------------------------
Original_text -> I am happy as it was a wonderful experience
Original_text_Label -> 1

Perturbed_text -> I am happy as it was a marvellous experience
Perturbed_text_Label -> 0

---------------------------------------------------------------------------
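
The summary table from Step 4 follows directly from these two results. As a sanity check, the sketch below recomputes the success rate, accuracy under attack and average perturbed word percentage from the texts and labels printed above. It splits on whitespace, which is an assumption: TextAttack uses its own word tokenizer, although for these two inputs the counts agree with the table's average of 26.0 words per input. It also assumes both original predictions were correct, as the reported 100% original accuracy indicates.

```python
# Texts and labels copied from the attack output above.
ORIGINAL_1 = (
    "Don't waste your time or money on this one. This book is terrible. "
    "Whatever happened to Amanda Quick writing great books. She used to be my "
    "favorite autor. It will be a long time before I ever purchase another one "
    "of her books."
)
# (original text, true label, perturbed text, label after attack)
results = [
    (ORIGINAL_1, 0, ORIGINAL_1.replace("waste", "desolate"), 1),
    ("I am happy as it was a wonderful experience", 1,
     "I am happy as it was a marvellous experience", 0),
]


def perturbed_word_pct(original, perturbed):
    """Percentage of whitespace-separated words that differ position by position."""
    orig_words, pert_words = original.split(), perturbed.split()
    changed = sum(o != p for o, p in zip(orig_words, pert_words))
    return 100.0 * changed / len(orig_words)


# An attack succeeds when the label after the attack differs from the true label.
successes = sum(true != after for _, true, _, after in results)
success_rate = 100.0 * successes / len(results)
accuracy_under_attack = 100.0 * (len(results) - successes) / len(results)
avg_perturbed = sum(perturbed_word_pct(o, p) for o, _, p, _ in results) / len(results)
avg_words = sum(len(o.split()) for o, _, _, _ in results) / len(results)

print(f"Attack success rate: {success_rate:.1f}%")             # 100.0%
print(f"Accuracy under attack: {accuracy_under_attack:.1f}%")  # 0.0%
print(f"Average perturbed word %: {avg_perturbed:.2f}%")       # 6.72%
print(f"Average num. words per input: {avg_words:.1f}")        # 26.0
```

One changed word out of 43 and one out of 9 average to 6.72%, matching the reported figure.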
