Module Creation

Module creation with NovelAI involves preparing training data as a well-formatted .txt file and spending Steps to train Prompt Tuning vectors on that data. Various tools exist to help you format your training data and create an effective Module.

= Condensed Guide =

1. Theme
Modules are best for nudging the story toward a general theme or an author's writing style. Put precise details in the Lorebook instead.

2. Curate
Consider which texts would raise or lower the quality of your training data. Be critical about what to include. For each text, ask:
 * Does it contribute significantly to the theme?
 * Is the prose high quality?
 * How large is it? What portion of the data should a single story take up?
 * Does it have variety? Otherwise you may end up with recurring characters. It's best to use books that don't come from the same series.
 * How many steps are in your budget? How does that scale to your data's size? (Ex. 3 MB to 5 MB: 2k steps; 8 MB to 10 MB: 3k steps.)

Read reviews of books to determine whether they meet the criteria above; Goodreads is a good source of reviews. For example, a sci-fi module can be trained on 10 books totaling 8.5 MB, with one book per author if you're going for variety. Take text. They can't sue you (yet).
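
To answer the "what portion of the data should a story take up?" question with numbers, a small script can help. The following is only a rough sketch: the function names and the `max_share` threshold are made up for illustration and are not part of any NovelAI tool.

```python
from pathlib import Path


def folder_sizes(folder):
    """Return {file name: size in bytes} for every .txt file in a folder."""
    return {p.name: p.stat().st_size for p in Path(folder).glob("*.txt")}


def size_shares(sizes_bytes):
    """Return each file's fraction of the total data size (0.0 to 1.0)."""
    total = sum(sizes_bytes.values())
    if total == 0:
        raise ValueError("no training data found")
    return {name: size / total for name, size in sizes_bytes.items()}


def oversized(sizes_bytes, max_share=0.25):
    """List files whose share exceeds max_share.

    max_share is an arbitrary example threshold, not a recommended value.
    """
    shares = size_shares(sizes_bytes)
    return sorted(name for name, share in shares.items() if share > max_share)
```

Calling `oversized(folder_sizes("training_data"))` (with a hypothetical folder name) lists the texts that dominate the data set, which are the ones most likely to push recurring characters into the module.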

3. Clean
This is a very important step. Search for useful tools (ex. Gnurro's Reformatter).

 * Fix errors in OCRs and scans.
 * If you're using an ebook, look for guides on the internet on converting it to .txt.
 * Remove all chapter headings, headers, and footers.
 * Remove hard line wrapping, including words that are broken with a hyphen and continued on the next line.
 * Replace characters like smart quotes with plain characters if you don't use them.
 * Remove stray spaces and newlines at the end of paragraphs.
 * Remove or add spaces according to preference. Some stories have double spaces between sentences.
 * Mark chapter transitions or scene transitions with *** (possibly optional).
 * Possibly do some dialogue editing as well.
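
If you'd rather script some of the cleanup than do it by hand, something like the sketch below covers a few of the mechanical parts. It is not how Gnurro's tool works; the heading pattern and the character table are assumptions, so adjust them to your own texts and check the result by eye.

```python
import re

# Smart punctuation mapped to plain equivalents (extend as needed).
_PLAIN = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u2026": "...",                # ellipsis character
})

# Assumed heading format: "Chapter 12" or "CHAPTER IV" alone on a line.
CHAPTER_HEADING = re.compile(r"^\s*chapter\s+(?:\d+|[ivxlc]+)\.?\s*$", re.IGNORECASE)


def clean_text(raw):
    """Apply a few of the mechanical cleanup steps to raw training text."""
    text = raw.translate(_PLAIN)
    # Strip stray spaces at the end of every line.
    text = "\n".join(line.rstrip() for line in text.split("\n"))
    # Rejoin words split by a hyphen at a line break. This also merges
    # genuinely hyphenated words that happen to break there.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    cleaned = []
    for line in text.split("\n"):
        if CHAPTER_HEADING.match(line):
            cleaned.append("***")  # mark the transition instead of the heading
        else:
            cleaned.append(re.sub(r" {2,}", " ", line))  # collapse double spaces
    return "\n".join(cleaned)
```

OCR errors, headers, footers, and bad paragraph breaks still need a manual pass or a dedicated tool.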

4. Train
"How many steps should I take?"

It depends. We are playing with black-box black magic, after all. Rough estimates suggest that if you want an 'influence', you should collect a lot of data and use however many steps the training tool recommends to hit 100% coverage. If you want specific lore, such as a single series, or you want to self-insert into the books alongside the established characters, themes, and word choices of the source material, go beyond 100%: the more steps, the stronger the effect. At that point you're intentionally overfitting your module, making it better at replicating the source data and somewhat worse at general performance. You'll also run into diminishing returns because of how much data can be compressed into a single 200k module.

5. Test
If the result doesn't quite match your vision, go back to the earlier steps. There are several ways to test your module:
 * Blank prompt starting with just ***
 * A series of short one-sentence prompts
 * Within an existing scenario. Some tags can interact poorly with your module, so change them if needed.

= Gnurro Reformatter Guide =

Tool: Gnurro Reformatting Tool


 * 1) Install Python and pip
 * 2) Shift + right-click inside the Gnurro folder and choose "Open PowerShell window here"
 * 3) Type "pip install x" (where x is one of the items in requirements.txt) and repeat for all items, or type "pip install -r requirements.txt" to install them all at once
 * 4) Open baseGUI and use the tool to open trainingdata.txt
 * 5) Mode > initialprep
 * 6) Click all quick fix buttons except for "remove block layout" and "remove bad paragraph breakers"
 * 7) Mode > source inspector
 * 8) Click the arrows, fix bad paragraph breakers manually
 * 9) File > save
 * 10) PROFIT

= Adding Images to Modules =

= Turning Modules to Do/Say Adventure Mode =
 * 1) Create Module.
 * 2) Open Module in text editor.
 * 3) Change "Mode" from 0 to 1.
 * 4) Delete the existing Module in NAI.
 * 5) Import edited module.
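
The text-editor step above can also be scripted. This is a minimal sketch that assumes the module file is plain text containing a quoted "Mode" key with the value 0 (the exact key casing and file layout are assumptions, so the match is case-insensitive). The function name is made up for illustration.

```python
import re

# Matches a quoted mode key followed by the value 0, e.g. "mode": 0
_MODE_ZERO = re.compile(r'("mode"\s*:\s*)0\b', re.IGNORECASE)


def set_adventure_mode(module_text):
    """Return module_text with the first "Mode" value changed from 0 to 1."""
    new_text, count = _MODE_ZERO.subn(r"\g<1>1", module_text, count=1)
    if count == 0:
        # Either the file layout differs from the assumption or the
        # module is already switched; leave the decision to the user.
        raise ValueError('no "Mode" field with value 0 found')
    return new_text
```

To use it, read the exported module with `Path("my.module").read_text()` (a hypothetical file name), pass the text through `set_adventure_mode`, and write the result to a new file so the original stays untouched until the import works.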

= Technical Explanation of Prompt Tuning =

= Other Module Creation Guides =

 * Manwhore's Module Making Guide
 * OccultSage's Module Making Guide