[personal profile] girlrock

disclaimer: not a stats major. i wasn’t able to scrape all the fic data the way i wanted to, so this post might not be very interesting—sorry in advance for the fact that a lot of the summaries are fairly basic! hopefully i’ll be able to figure out a better way of pulling/presenting data by next year and return with a more salient post then. any feedback is of course still appreciated ^__^
(data taken january 5, 2021)


METHODOLOGY
i used this scraper to facilitate some data lookup, while the rest was just me looking at simple numbers with ao3 search methods and compiling them to be easier to look at. any minor discrepancies between results are most likely due to the fact that the scraper can’t look at user-locked works in the archives, but thankfully the txt tag is 99.2% public and this shouldn’t be a huge issue.

i also want to clarify the nuances between these three metrics, as i’ll be using the latter two in my own analysis:

date posted: the date a fic is first posted to the archive, either as its first chapter(s) or as a single-chapter fic. this data is currently not easily obtainable on ao3 and therefore cannot be used to clearly delineate fandom growth trajectory.

date updated: the date a fic was last updated. if a fic is a complete and single-chapter fic, this will be the same as the date posted. the problem is that a multi-chapter fic posted in 2017 that gets updated a year later will lose the original 2017 datapoint, and we’ll only be able to see this fic as belonging to 2018. this is still a reflection of active fandom participation—the fact that people are updating their fics into another year—but it doesn’t necessarily show us that people are producing new fic.

date completed: a presumptuous name for the “date updated” of a fic marked as complete, whether single-chapter or multi-chapter. i.e. this assumes that completed fics are not updated again after they’ve been marked as complete, but this is probably a trivial enough issue that it won’t affect our general understanding of fandom growth. neither of these metrics is perfect, but they should suffice for the numbers we’re looking at! i will use either updated or completed methodology (seemingly arbitrarily but i kind of have some reasoning behind it most of the time…)
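to make the difference between the two usable metrics concrete, here's a minimal sketch. it assumes each scraped work can be reduced to a last-updated date and a complete flag; the `Work` record, its field names, and the sample rows are made up for illustration and are not the scraper's actual output.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Work:
    # hypothetical record; field names are assumptions, not the scraper's schema
    title: str
    updated: date   # last-updated date shown on ao3
    complete: bool  # whether the work is marked complete


def count_by_year(works, completed_only=False):
    """count works per year of last update; optionally keep only complete works."""
    years = Counter()
    for w in works:
        if completed_only and not w.complete:
            continue
        years[w.updated.year] += 1
    return dict(years)


# made-up sample: the multi-chapter fic started in 2017 only shows up under 2018
sample = [
    Work("oneshot", date(2017, 5, 1), True),
    Work("multichapter started in 2017", date(2018, 3, 9), False),
    Work("finished multichapter", date(2018, 11, 2), True),
]

updated_counts = count_by_year(sample)                         # "date updated" view
completed_counts = count_by_year(sample, completed_only=True)  # "date completed" view
print(updated_counts)
print(completed_counts)
```

the in-progress fic counts toward 2018 in the updated view and drops out of the completed view entirely, which is the tradeoff described above.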


BASIC DATA
>> category distribution
>> rating distribution
nothing to write home about here: the category distribution is unsurprising, given… you know, boy group fandom. the rating distribution is definitely more pg than some from bigger k-pop boy group fandoms on ao3, but it should also be fairly self-explanatory due to the members’ ages.
when the tag is filtered solely by “explicit” fic, yeonbin make up 42.6% of the tag instead of 34.4% of the general txt tag, potentially because they’re the oldest members. but what does the rating breakdown look like for all ships?
here, the ships are sorted by their percentage of explicit fic. a reminder that i cannot control txt ficdom and i have not personally written any of these fics; i’m just doing this for the data. after yeonbin, beomkai and taebin also have significant amounts of explicit fic, while yeonkai have the highest percentage of mature + explicit fic combined.

>> top 12 additional tags
ao3 tag wrangling is kind of finicky so these numbers might vary somewhat depending on how you view the tag (e.g. “friends to lovers” here doesn’t include “enemies to friends to lovers,” but it is counted as a same-meaning tag in the archives). a lot of people also tag their fics with variations of “fluff and angst” or “fluff and smut and angst” or “fluff and smut” which complicates things further, but the main takeaway here is that txt fans love college aus and that fluff is more common than angst.
around 32.3% (or 1 in 3) of txt fics in general are tagged as “fluff” without any modifiers, and around 44.1% are tagged as any form of “fluff.”
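a rough sketch of the "exact tag vs. any form of the tag" distinction, assuming you had each fic's additional tags as a list of strings. the sample tags are made up, and plain substring matching is only an approximation of how ao3 wrangles same-meaning tags.

```python
def tag_shares(works_tags, tag="fluff"):
    """return (exact_share, any_form_share) for a tag across a list of tag lists."""
    total = len(works_tags)
    # exact: the tag appears on its own, with no modifiers
    exact = sum(1 for tags in works_tags if tag in {t.lower() for t in tags})
    # any form: the tag appears anywhere inside at least one tag string
    any_form = sum(1 for tags in works_tags if any(tag in t.lower() for t in tags))
    return exact / total, any_form / total


# made-up sample of four fics
sample = [
    ["Fluff", "College AU"],
    ["Fluff and Angst"],
    ["Angst"],
    ["Fluff and Smut", "Friends to Lovers"],
]

exact_share, any_share = tag_shares(sample)
print(f"exact: {exact_share:.1%}, any form: {any_share:.1%}")
```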

>> top 10 ships



mean fic count per 2-person ship: 456.2

as you can see, yeonbin are easily the most popular txt ship. out of the current 4,201 works at the time of data retrieval, 34.4% of them have been tagged as yeonjun/soobin. another way of looking at it is that yeonbin still have more fic as 1 ship than the bottom 6 txt ships combined.

2020 also saw large growth for taegyu and soogyu fic, who had positive rank changes between their 2019 and 2020 numbers. taegyu only recently overtook sookai as the #2 ship in the txt archives!
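a quick sanity check on the ship numbers above, as a rough sketch: it assumes the 456.2 mean is taken over all 10 possible two-person ships (5 members choose 2), which the post doesn't state outright. the variable names are just for illustration.

```python
from math import comb

TOTAL_WORKS = 4201      # works in the txt tag at data retrieval
YEONBIN_SHARE = 0.344   # share of works tagged yeonjun/soobin
MEAN_PER_SHIP = 456.2   # mean fic count per 2-person ship
N_MEMBERS = 5

n_ships = comb(N_MEMBERS, 2)                        # number of possible 2-person ships
yeonbin_works = round(TOTAL_WORKS * YEONBIN_SHARE)  # approximate yeonbin fic count
# assumption: the mean covers every possible ship, so mean * ships = sum of ship tag counts
total_ship_tags = round(MEAN_PER_SHIP * n_ships)

print(n_ships, yeonbin_works, total_ship_tags)
```

if that assumption holds, the ship tag counts add up to more than the number of works, which can only happen when some fics carry more than one ship tag (the multishipping that the purity section looks at).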


FANDOM GROWTH


again: date updated is a dynamic number and not a perfect reflection of fandom growth. but looking at these graphs, we can objectively see that txt fandom has experienced huge growth in 2020, and it’s impressive that we’ve already seen an update count of 100+ five days into 2021. this means that not even 2% into the year, we’ve already posted or updated over 22% of the fics last updated in 2019.
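the "not even 2% into the year" comparison can be checked with a few lines. this is only a sketch: it assumes the 2019 total is the 566 used as the 2019 denominator in the crossover section, and it back-computes what "over 22%" implies rather than using the actual 2021 count.

```python
from datetime import date

DATA_DATE = date(2021, 1, 5)  # data retrieval date from the post
WORKS_2019 = 566              # assumption: fics last updated in 2019
SHARE_OF_2019 = 0.22          # "over 22%" from the post

day_of_year = DATA_DATE.timetuple().tm_yday    # days elapsed in 2021
year_elapsed = day_of_year / 365               # 2021 is not a leap year
implied_updates = SHARE_OF_2019 * WORKS_2019   # works needed to reach 22% of 2019

print(f"{year_elapsed:.1%} of the year elapsed")
print(f"22% of {WORKS_2019} is about {implied_updates:.0f} works")
```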

looking at the monthly completion graph, we can also see that there are general surges in fic counts around comeback time (during the same month, or into the next if an album is released later in the month). i’m sure the numbers speak for themselves, but it’s pretty exciting to see how much the ficdom has grown since blue hour!
a final way of looking at this growth is that 476 completed fics were last updated in december 2020 alone, while only 429 completed fics were last updated in the entire year of 2019. in short, our output in literally one month of 2020 surpassed our entire 2019 output.


DISTRIBUTIONS
>> kudos distribution
the agenda here is to show that most txt fics don’t have 100+ kudos and that it’s even rarer to hit 1000—the kudos median is actually somewhere in the 50s range.

of course the numbers don’t really matter, but if you’ve been feeling bad about not having a high amount of kudos, then don’t worry: you’re probably doing fine! and if these numbers somehow discourage you because you wish you were writing for a fandom with huge readership and not an up-and-coming 2019 k-pop boy band slowly climbing its way to ao3 relevance, then… i’m sorry for being the bearer of bad news.
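if you had a per-fic list of kudos counts (which this post did not fully have), the median and the share of fics past each milestone could be computed like this. the sample numbers are made up and are not txt data.

```python
from statistics import median


def kudos_summary(kudos, thresholds=(100, 1000)):
    """return the median and the share of works at or above each threshold."""
    shares = {t: sum(k >= t for k in kudos) / len(kudos) for t in thresholds}
    return median(kudos), shares


# made-up sample of ten fics
sample_kudos = [3, 12, 27, 48, 55, 61, 90, 140, 320, 1500]

med, shares = kudos_summary(sample_kudos)
print(f"median kudos: {med}")
for threshold, share in shares.items():
    print(f"{threshold}+ kudos: {share:.0%} of fics")
```

the median is the useful number here because a handful of very popular fics would drag a plain average way up.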

>> word count distribution
sorry i don’t have the actual fic distribution due to my data scraping limitations, but i think it’s pretty fascinating how much ao3 trends toward shortfic without us really realizing the extent of it (or maybe it’s just me). as you can see, most txt fics are well under 5k words long and only 15% are over 10k!


“PURITY” METRICS
♡ welcome to the experimental section! ♡

what does “otp: true” even mean?: otp: true is a search filter that returns fics where only one ship is tagged. it doesn’t account for multiship fics where only one ship is the main pairing—instead it excludes the idea of multishipping altogether.

here, we see that yeonbin (the biggest ship in the tag) is the most “pure” percentage-wise, meaning that they are exclusively shipped in over half of their fic. yeonkai are at the bottom of the purity rating, but they’re already the least popular txt ship so there is no change in their ranking. we then see the largest drops for sookai and taegyu, who fall 4 and 6 spots respectively.
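as a sketch of what the purity percentage measures: take every fic tagged with a ship, then see what fraction has that ship as its only relationship tag. this assumes each fic is reduced to a list of relationship tags and that platonic "&" tags count as relationships too; the sample fics are made up.

```python
def purity(works, ship):
    """share of a ship's works in which it is the only relationship tagged."""
    tagged = [rels for rels in works if ship in rels]
    exclusive = [rels for rels in tagged if len(rels) == 1]
    return len(exclusive) / len(tagged)


# made-up sample: each inner list is one fic's relationship tags
sample = [
    ["yeonbin"],
    ["yeonbin"],
    ["yeonbin", "taegyu"],
    ["taegyu"],
    ["taegyu", "yeonbin"],
    ["sookai"],
    ["sookai", "soobin & hueningkai"],  # a platonic tag also breaks exclusivity
]

for ship in ("yeonbin", "taegyu", "sookai"):
    print(f"{ship}: {purity(sample, ship):.1%} exclusive")
```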

further meditation: can we expand on this idea of ship purity? one thing that interests me about fandoms where one ship is clearly ahead is the idea of “secondary pairings” and how fans often pair off a group into convenient ships. this means that yeonbin can’t overlap with sookai or beomjun, but it can overlap with taegyu and tyunning. the problem is that we can’t really scrape whether a fic has a ship tagged as the main pairing or the secondary pairing, which is a far more interesting insight than what “otp: true” offers, but we can still look at which ships contribute to these purity drops.
just by looking at the taegyu tag, we see that 301/757 = 39.8% of its works are tagged as yeonbin, while sookai’s second-largest ship (beomjun) only accounts for 75/598 = 12.5% of the tag. interestingly enough, sookai’s large drop in “purity” seems to be mainly attributable to the fact that 136/598, or 22.7% of its works, are also platonically tagged as “soobin & hueningkai.”
the tenuous hypothesis is that ships that don’t include yeonjun or soobin are often multishipped with yeonbin (the largest ones being taegyu at #2 and tyunning at #6) and potentially suffer from “2nd pairing” tagging syndrome. of course, it’s important to note that there could also be a substantial amount of multiship fic where yeonbin are actually the side pairing.
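the overlap percentages above are plain ratios of "works carrying both tags" to "works in the ship tag". a small sketch using the counts quoted in the post (the `overlaps` layout and the helper name are just for illustration):

```python
def share(part, whole):
    """percentage of `whole` made up by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# (ship tag, co-occurring tag) -> (works with both tags, works in the ship tag)
overlaps = {
    ("taegyu", "yeonbin"): (301, 757),
    ("sookai", "beomjun"): (75, 598),
    ("sookai", "soobin & hueningkai"): (136, 598),
}

for (tag, other), (both, total) in overlaps.items():
    print(f"{share(both, total)}% of {tag} works are also tagged {other}")
```

note that this says nothing about which ship is the main pairing in those fics, only that both tags are present.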


SHIPPABILITY?

sorry if this seems kind of nonsensical or hard to read… the table and chart are the same data, just two different ways of looking at it: the left side is the aggregate count of all 4 two-person txt ships per member, i.e. yeonbin - soogyu - taebin - sookai for soobin. the right side is just their general tag counts as characters on ao3.

soobin is the most-tagged and most-shipped txt member (within txt ships, that is) on ao3. hueningkai and taehyun have similar tag counts in general, but taehyun seems to be shipped a little more; otherwise the numbers are pretty consistent between both methodologies.
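the aggregate "shippability" count is just the sum of the four ship tag counts each member appears in. here is a minimal sketch with ships keyed by member pairs; the counts are placeholders (10, 20, ... 100) and NOT the real ao3 numbers.

```python
from collections import defaultdict
from itertools import combinations

MEMBERS = ["yeonjun", "soobin", "beomgyu", "taehyun", "hueningkai"]


def shippability(ship_counts):
    """sum each member's ship tag counts across every ship they appear in."""
    totals = defaultdict(int)
    for (a, b), n in ship_counts.items():
        totals[a] += n
        totals[b] += n
    return dict(totals)


# all 10 possible two-person ships, with placeholder counts (not real data)
pairs = list(combinations(MEMBERS, 2))
placeholder = {pair: 10 * (i + 1) for i, pair in enumerate(pairs)}

totals = shippability(placeholder)
for member, n in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(member, n)
```

since a multiship fic gets counted once per ship tag, these totals are tag counts and not unique fics.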


CROSSOVERS & THE “BTS EFFECT”
welcome again to the experimental section ♡ disclaimer: this data was taken purely out of curiosity, and i am not trying to imply anything about txt or bts or their professional relationship or Company Stanning or anything at large, and i also extremely do not care at the end of the day. it’s common rhetoric that many txt fans are bts fans (once again—not saying this is a GOOD OR BAD THING), but is bts/txt crossover fic becoming more or less common as txt grow? are txt the ficdom with the largest bts presence to begin with, or does that title go to a non-bighit group?
top k-pop fandoms on ao3 vs. % of fic tagged as bts
from this table, we can see that txt definitely have a pretty large percentage of bts fic in their archives, but they don’t seem to be the top group (i arbitrarily pulled numbers for the largest boy group/girl group fandoms and had no other group selection methodology, so this could be missing significant groups). [this isn’t a bts post so i won’t go into detail about bp and got7 being at the top of the list, but... i think we can all see that it’s not particularly surprising.]

so, around 12% of txt fic is also tagged as bts. how is this percentage actually trending?

2019 bts fic: 99/566 = 17.5%
2020 bts fic: 386/3517 = 11.0%
as we can see, the percentage of bts fic in the txt archives has gone down significantly since 2019. of course, the raw number has increased with the general growth of txt fandom, but the % itself has dropped by more than a third. bts also remains the largest crossover fandom in the txt archive. here is a look at other common fandoms:
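to separate the change in percentage points from the relative change, here's a short sketch using the two yearly counts quoted above (the dict layout is just for illustration):

```python
# year -> (fics also tagged bts, fics counted for that year)
bts_by_year = {2019: (99, 566), 2020: (386, 3517)}

shares = {year: bts / total for year, (bts, total) in bts_by_year.items()}
point_change = shares[2020] - shares[2019]   # difference in percentage points
relative_change = point_change / shares[2019]  # change relative to the 2019 share

for year, s in shares.items():
    print(f"{year}: {s:.1%}")
print(f"change: {point_change * 100:+.1f} points ({relative_change:+.1%} relative)")
```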
top k-pop fandom crossovers in the txt ao3 tag
(note: i displayed top fandoms + a few 4th-gen groups i was curious about, so there might be groups missing in the middle)

that’s all i’ve got for crossovers! stray kids and ateez have large ficdoms and are also 4th gen groups, so it’s not surprising that they have a high amount of crossover with txt.

it's also worth noting that the bts numbers don’t really tell us whether txt or bts are the main fandom—it’s possible that we are seeing an increase in txt fic with background bts characters, as opposed to bts fic with background txt characters. there is also a small but significant amount of txt/bts ship fic. it’ll still be interesting to see what this percentage looks like in 2021, though!


CREDITS
some formatting inspired by tumblr user vppax’s bts stats post and toastystats’s general ao3 work. i don’t really think i risk either of them running into this post, but if anyone does see this then i sincerely hope you don’t mind that i blatantly plagiarized the kudos/wc distribution table idea... heh...

thanks for reading! ♡

Date: 2021-01-07 02:04 pm (UTC)
From: [personal profile] permutative
i feel like i had more thonks when i was reading this at 5am but on the reread i'm just like... wow this is so pretty... as always i'm impressed at your ability to FORMAT THINGS so nicely & ofc i'm obsessed with the intersection of my two interests (numbers and txt) ummm what else
> find it interesting that beomkai & taebin have a lot of explicit fic? i would've guessed sookai would be #2 hahaha. maybe it's cuz they're less popular so they can be more easily skewed by, say, one author writing a lot of pwp or something
> not surprised by the uni aus
> IM OBSESSED WITH HOW MUCH TAEGYU HAS GROWN!? truly so happy for them even if i think a part of it is that yeonbin/txt ficdom at large has grown & then taegyu is a common side-pair... like sookai *can't* be a side to yeonbin so it's at a disadvantage i guess. sigh
> soogyu gaining popularity is surprising [because i assumed it was always popular i guess? like mostly everyone i follow is chwe-line biased lol]
> AMAZED AT TXT FANDOM GROWTH IN GENERAL. can't wait for 2022 when theres like 2k+ works in the taegyu tag <3 more food for me <3
> the shippability thing... lol not surprised at soobin even though i feel like HUENINGKAI is the hidden person who can actually be shipped with everyone too. :)))
> re: crossovers i'm determined to make txthypen overtake btxt. lets do it!!!!!!!!!! (also i don't get why the bp/got7 thing isn't surprising but i also don't know/care enough)

anyways this was delightful, thank you so much for your scholarship, you're truly the invisible backbone of the txt fandom, etc etc <3

Date: 2021-01-07 03:49 pm (UTC)
From: [personal profile] lapiscave
ur amazing this was so fun. i think my only comment is it's interesting to see how beomgyu ships became more popular in 2020 and my fav chart was the timeline graph of fics with the comebacks bc u could see the increase with every cb. thank you for this incredible insight :O

Date: 2021-01-07 08:57 pm (UTC)
From: [personal profile] lovebalance
this was such a fun little experiment, i love how you compiled the data and put everything together!!! it was fun seeing how soobin is the most shipped member in txt!!! i wonder if soogyu will continue to grow in 2021 as well!!!