The E-book industry in Japan is gaining momentum. Both contemporary releases and the more popular classics are increasingly published digitally alongside their physical editions. Through its Kindle service, Amazon.co.jp has a particularly strong hold on this market, and its extensive catalog is good news for any intermediate or advanced learner of Japanese. Looking up new vocabulary is easier in digital media than in print, and both physical Kindles and Kindle for Windows are surprisingly feature-rich, with built-in dictionaries and flashcard options.

In earlier posts I’ve described my method of reading digital texts as HTML files in a web browser, using pop-up dictionaries such as Rikaisama or Yomichan in combination with the flashcard application Anki for ‘vocab mining’22 native materials.19 This post describes the options for combining that method with Kindle E-books and sums up several alternative sources of E-books. In short, it is possible to use the Japanese Kindle market efficiently for study purposes, but doing so requires a bit of technical prowess and borders on the illegal.

Amazon Kindle

For the past two years or so I’ve often relied on Amazon.co.jp’s Kindle service for both academic purposes and leisure, especially while living in Japan.5 Their selection of Japanese-language material is immense, with Kodansha-published E-books alone accounting for over 40,000 titles in their library. Having relied on the study method mentioned above for a while now, I looked into my options for applying it to Kindle E-books as well.

Through Browsers

The first option would be to open Kindle E-books in my browser and use Rikaisama or Yomichan to import vocabulary into Anki. When purchasing an E-book on Amazon, one specifically purchases the right to read the E-book and to download a proprietary Amazon file (the .AZW format) through the Kindle software. This proprietary format is practically identical to the .MOBI format (which in turn is basically a compressed HTML file, specifically using the Ebook HTML syntax), with additional DRM protection on commercial releases as an attempt to counter copyright infringement.

Both Chrome and Firefox have plug-ins for parsing those two formats, but these plug-ins are frankly not sufficient for our purposes, and for obvious reasons don’t support files with DRM. One option would be to remove the DRM through software and convert the Kindle .AZW file to the HTMLZ format, which is quite literally a compressed (zipped-up) HTML format. Unzipping the .HTMLZ file and opening the resulting HTML page with your browser of choice is all it takes to achieve our goal of simultaneously reading and vocab-mining Kindle E-books with Yomichan or Rikaisama.
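For the unzipping step, a few lines of Python will do as well as any archive tool. The snippet below is only a minimal sketch for a DRM-free archive: it assumes Calibre's zipped HTML output (a file ending in .htmlz) is an ordinary zip file whose main page is called index.html, and the function name extract_htmlz is my own invention.

```python
import zipfile
from pathlib import Path


def extract_htmlz(archive_path, target_dir):
    """Unzip a zipped-HTML E-book and return the path of its main HTML page."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        names = archive.namelist()
    # Assumption: the main page is named index.html.
    if "index.html" in names:
        return target / "index.html"
    # Otherwise fall back to the first HTML file found in the archive.
    html_names = [n for n in names if n.lower().endswith((".html", ".htm"))]
    if not html_names:
        raise ValueError("no HTML page found in archive")
    return target / html_names[0]
```

Calling extract_htmlz("book.htmlz", "book") (both names are placeholders) returns the page to open; webbrowser.open(page.resolve().as_uri()) then hands it to your default browser, where Yomichan or Rikaisama can take over.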

Converting an E-book is a simple process with the popular open-source tool Calibre. Calibre supports third-party plug-ins, including one used to bypass E-book DRM. The process would thus be as simple as locating your legally purchased Kindle E-book in .AZW format with Calibre, converting the book to HTMLZ, extracting the HTMLZ file and opening the content with a browser.20 Even so, I had some interest in the legal aspects of bypassing DRM and converting E-books for such fair-use purposes. International copyright law in the digital age is an ever-changing field, and I am not a legal expert by any stretch, but I will briefly summarize my understanding of the current situation.

Legality

Kindle E-books are protected by DRM (Digital Rights Management), a form of Technological Protection Measure (TPM) meant to prevent copyright infringement on a digital level (i.e. piracy). DRM is a controversial topic, and while some industries, such as the music industry, are backing away from it (no doubt due to the rise of streaming services), DRM mechanisms are still inherent to the E-book industry. Regardless of one’s ethical principles, the legality of fair-use DRM stripping (i.e. place shifting as a means of back-up, or format shifting as described in this article) should be touched upon. If one has, for example, a large, expensive library of DRM-protected E-books and its distributor goes out of service, or the tools to read those E-books are no longer supported, what options remain for the customer to enjoy their legally obtained works?13

Technically, the removal of digital copy protection mechanisms such as DRM falls under anti-circumvention laws. Now, laws differ depending on the region, and while some laws directly forbid the removal of digital copy protection mechanisms, others may contradictorily permit format shifting or creating back-ups for personal use. Nevertheless, most countries today are members of the World Intellectual Property Organization (WIPO) and have adopted the WIPO Copyright Treaty (WCT). While specific implementations of the WCT differ between member states, WIPO (a United Nations agency) explicitly describes the general act of DRM removal as illegal.

USA

The anti-circumvention provisions of the WCT-inspired DMCA, as upheld in the United States, explicitly state that while circumvention of copy-control measures8 is not illegal, circumvention of access-control measures very much is. The latter (quite literally measures to control which platform can access the media) apply to E-books as well, and format shifting, regardless of fair-use intent,11 thus falls under this category. Unfortunately, the reality is that most DRM implementations contain both kinds of measures. Furthermore, a controversial 1999 ruling held the very act of linking to pages hosting circumvention software to be illegal trafficking. There is however a more recent legal case on this topic, in which an E-book store, Abbey House Media, contractually obliged to sell E-books with DRM, was sued after closing down and disclosing information on how to remove this DRM. Although it was sued for inducement of contributory infringement, a federal judge in New York ruled that A) Abbey House did not induce infringement as there was no factual direct knowledge of infringement, and B) the infringement referred to was illegal redistribution, not DRM removal.10

EU

The European Union implements the WCT through the Copyright Directive (also known as the InfoSoc Directive), which received a controversial update just weeks before this post (26 March 2019).12 DRM falls under the 2001-era Article 6(3) and Article 7(2), and while the member states of the European Union again interpret and implement these articles differently, the actual Copyright Directive contains few to no practical exceptions protecting the individual.1415

Japan

Japan, too, implements the WCT as a WIPO member. Although copying media for back-up purposes is itself legal, not just the illegal distribution of copyrighted material on-line but also the downloading of music and movies is considered an offense under criminal law.1716 As for Technological Protection Measures, DRM in particular falls under Copyright Act Article 2.1 Clause 20 (著作権法2条1項20号) and Article 2 Section 7 of the Unfair Competition Prevention Act (不正競争防止法2条7項).18 Circumvention (「技術的保護手段の回避」) of both copy-control measures and access-control measures is, from a legal point of view, a criminal offense.

On a final note, when someone purchases media such as Kindle E-books through a distributor such as Amazon, they gain limited rights of access, just as the distributor has limited rights of distribution. Unlike with digital media in physical formats (such as CDs), these rights are determined through a contract with the distributor. Regardless of the applicable law, removal of DRM is first and foremost a breach of the Terms of Service one agrees to when signing up with Amazon, and Terms of Service generally do hold up in court.

TL;DR

To throw in my own two cents, I believe the technical implementations of anti-circumvention law are outdated and do not accurately represent the role of digital data in our lives. Furthermore, these laws assume that copy protection is circumvented solely for the purpose of illegal redistribution. Having said that, I also sincerely doubt anyone will actually face legal consequences for benign personal fair-use purposes such as converting a Japanese-language E-book to an HTML format for vocab mining (as described in this article), and as of this article’s date there are no such legal precedents (not to mention the lack of traceability when done in private). From a legal perspective however, bypassing DRM is illegal in WIPO countries and punishable under the legal framework of the country one resides in, with at least the possibility of termination of services if detected.

Through the Kindle Application

Given the above-mentioned issues of legality, detractors of that method could opt for another one: creating flashcards using your Kindle hardware and/or the Kindle application itself.

While reading a novel, one can look up words using the built-in dictionaries (e.g. J - E, or J - J). These “look ups” are saved in a vocabulary file, vocab.db, and in case you’re using Kindle hardware, synced to your desktop Kindle application. Using the service at https://fluentcards.com/kindle, one can then convert the vocab.db file client-side and import the result into Anki. To add example sentences, you’d need to convert the Anki note type to the one we’ve been using previously and use the bulk-edit feature of the Example Sentence Anki plug-in.
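If you would rather not depend on a web service, the same conversion can be roughed out locally. The sketch below rests on an assumption of mine rather than on anything Amazon documents: that vocab.db is an SQLite database with tables WORDS (id, word, stem, lang), LOOKUPS (word_key, book_key, usage, timestamp) and BOOK_INFO (id, title). Check your own file before trusting it; the function name export_lookups is likewise made up. It writes a tab-separated file that Anki's text import can read, with the sentence in which you looked up the word as the third field.

```python
import csv
import sqlite3


def export_lookups(db_path, tsv_path, lang="ja"):
    """Write word, stem, example sentence and book title as tab-separated rows."""
    # Assumed schema (not officially documented): WORDS, LOOKUPS, BOOK_INFO.
    query = """
        SELECT WORDS.word, WORDS.stem, LOOKUPS.usage, BOOK_INFO.title
        FROM LOOKUPS
        JOIN WORDS ON WORDS.id = LOOKUPS.word_key
        LEFT JOIN BOOK_INFO ON BOOK_INFO.id = LOOKUPS.book_key
        WHERE WORDS.lang = ?
        ORDER BY LOOKUPS.timestamp
    """
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(query, (lang,)).fetchall()
    finally:
        connection.close()
    # newline="" lets the csv module control line endings itself.
    with open(tsv_path, "w", encoding="utf-8", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)
    return len(rows)
```

The return value is simply the number of look-ups written, which makes for a quick sanity check against what the Kindle's own vocabulary list shows.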

Adding audio is unfortunately a bit more complicated. One could rely on text-to-speech services like the Anki plug-in AwesomeTTS. Another option would be to bulk-edit cards to import audio from Jpod101’s database, just like Rikaisama and Yomichan do, but there is no such plug-in yet. I will look into it myself at a later time.

Geo-blocking

For those not currently residing in Japan, I should also mention that there are some (legal) limitations to purchasing Kindle E-books outside of Japan. Due to licensing and logistical reasons, Amazon has not yet opened its Japanese E-book market to international customers, and similar to how video-streaming services such as Netflix and Prime Video block content based on region (geo-blocking), Kindle has certain technical limitations as well. Anyone purchasing from abroad needs at minimum an Amazon.co.jp account with an existing Japanese address, and Kindle E-book downloads abroad, while not blocked, are at the very least limited in how frequently one can purchase.23 Interestingly, Japanese E-book competitors https://honto.jp/ and https://www.ebookjapan.jp/ebj/, while obviously enforcing DRM, do not employ such geo-restrictions.

While outside the scope of this article, googling these topics reveals many threads on bypassing geo-blocks to purchase digital content abroad, and the subject offers an interesting case study in the economics of digitization. The European Commission has voted in favor of a Digital Single Market for countries already belonging to the European Common Market, but as far as I’m aware an international digital single market crossing existing market boundaries seems unlikely for now.

Aozora

Several years ago it was still a challenge to find native materials on-line. One option remains Aozora, the Japanese answer to Project Gutenberg, which freely hosts tons of public-domain books online.221 Due to the nature of public-domain literature,4 these might not be that accessible or entertaining for the casual intermediate-level learner. Regardless, there are some absolute classics such as Natsume Soseki’s Botchan (坊っちゃん) and Tanizaki Jun’ichiro’s In Praise of Shadows (陰翳礼讃), which both content-wise and length-wise are quite doable after just several years of studying the language.

While you can read Aozora novels as-is online using Yomichan or Rikaisama, I personally recommend downloading the text file and formatting it with JNovelFormatter so that it is more comfortable on the eyes.3
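To give an idea of what such formatting involves, here is a rough sketch of the general idea in Python. It is not how JNovelFormatter itself works, and the function name aozora_to_html is hypothetical. It assumes the usual Aozora conventions: readings in 《》, an optional ｜ marking where the annotated word starts, and editor notes in ［＃…］. It strips all of those so the pop-up dictionary only sees the running text, and leaves the explanatory header and footer blocks of the file untouched.

```python
import html
import re

RUBY = re.compile(r"《[^》]*》")          # furigana readings, e.g. 《おやゆず》
ANNOTATION = re.compile(r"［＃[^］]*］")   # editor notes, e.g. ［＃改ページ］


def aozora_to_html(text, title="Aozora text"):
    """Convert Aozora-style plain text into a minimal, dark-themed HTML page."""
    text = RUBY.sub("", text)
    text = ANNOTATION.sub("", text)
    text = text.replace("｜", "")  # marker for the start of an annotated word
    paragraphs = []
    for line in text.splitlines():
        line = line.strip()  # also removes the full-width indent space
        if line:
            paragraphs.append("<p>" + html.escape(line) + "</p>")
    body = "\n".join(paragraphs)
    return (
        '<!DOCTYPE html>\n<html lang="ja">\n<head>\n<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "<style>body{background:#111;color:#ddd;line-height:2;"
        "max-width:40em;margin:auto}</style>\n"
        f"</head>\n<body>\n{body}\n</body>\n</html>\n"
    )
```

As far as I know the downloadable Aozora text files are Shift_JIS-encoded, so you would read one with something like Path("botchan.txt").read_text(encoding="shift_jis") (the file name is a placeholder) and write the returned string to an .html file as UTF-8.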

Other

I should write a more extensive post on this topic at some point, but while it does not enjoy the same popularity as approximately 15 years ago, the cellphone novel genre is still very much alive and many successful releases still find their way to the physical world. Many of these are romantic in nature or deal with daily-life issues as seen from the perspective of teenage girls or young adult women. The genre’s target audience is evident when accessing the most popular platform, Maho i-Land, which greets the visitor with a slogan claiming to be 「日本最大級のガールズポータルサイト」 (one of Japan’s largest portal sites for girls). Due to their very nature these works are quite accessible even for early intermediate students, and can be read on-line using my suggested method.

Another interesting source of literature is the Digital Library of the Japan P.E.N. Club, the Japanese branch of an international association of progressive intellectual writers. The Japanese branch has strong ties to important Japanese writers such as Endō Shūsaku and Kawabata Yasunari, and is part of a larger association with ties to Belgian Nobel Prize winner Maurice Maeterlinck, Heinrich Böll, Jorge Luis Borges and even J.K. Rowling, to name a few. I recommend Kawabata’s One Arm (片腕).

Piracy of Japanese media is widespread, and while some argue piracy led to the success of Japanese animation in the West in the first place, the argument that piracy is killing the industry has led to harsh crackdowns on piracy in Japan. Nevertheless, there does seem to be an active piracy scene in Japan, spreading digital versions of more popular modern literature, such as light novels, formatted as .TXT files. For obvious reasons I can’t provide any sources, but this could make a good topic for a future article, perhaps.


  1. The official logo in celebration of 150 years of friendship, courtesy of the Embassy of Japan in Belgium

  2. According to Wikipedia, they host over 10,000 works, including both out-of-copyright works and those made freely available by the authors. Read more: https://en.wikipedia.org/wiki/Aozora_Bunko

  3. While the Aozora-recommended web-application airzoshi is an attractive alternative, its formatting effectively blocks the usage of Yomichan or Rikaisama, rendering the whole exercise pointless. 

  4. Public domain literature. 

  5. Honestly, I’ve generally relied on Amazon while living in Japan. Business practices aside, Amazon really is incredibly convenient, and as a student you get one year of Amazon Prime, with access to Kindle Prime Reading and Prime Video, for free.

  6. There is actually a preceding legal case on this topic as seen in Abbey House Media v. Apple Inc. In short, an E-book store contractually obliged to sell E-books with DRM closed down and disclosed information on how to remove this DRM. Although they were sued for inducement of contributory infringement, a federal judge in New York ruled that A) Abbey House did not induce infringement as there was no factual direct knowledge of infringement, and B) the infringement referred to was of further illegal distribution, not of DRM removal. Gizmodo has a brief piece on this at https://gizmodo.com/its-perfectly-legal-to-tell-people-how-to-remove-drm-1670223538.  

  7. http://www.dmlp.org/legal-guide/circumventing-copyright-controls 

  8. Thus under this provision of the DMCA, it would be legal to bypass copy-control measures for private back-up purposes.

  9. While specific implementations of these laws differ by region, the World Intellectual Property Organization (WIPO) explicitly describes the general act of DRM removal as illegal: https://www.wipo.int/ip-outreach/en/ipday/2016/ip_digital.html

  10. Abbey House Media v. Apple Inc. Gizmodo has a brief piece on this at https://gizmodo.com/its-perfectly-legal-to-tell-people-how-to-remove-drm-1670223538.  

  11. https://info.legalzoom.com/dmca-backup-copyrighted-content-22827.html https://www.wired.com/2010/03/dmca-muscle-strong-arms-dvd-copying/ 

  12. Directive Article 17 (known as Draft Article 13) makes on-line platforms directly liable for copyright infringement by their users and could lead to the implementation of filters to remove copyrighted material on most big on-line platforms. Directive Article 15 (known as Draft Article 11) will effectively limit social media and search engines in their capability of aggregating and hot-linking. Both articles are widely criticized; the controversy lies in the (solid) assumption that they will lead to a decrease in creative content and limit access to information, but member states have another two years to implement these measures and it remains to be seen how social media platforms will respond. On a positive note, the EU did at least implement an exception in copyright law for scientific text and data mining (TDM) purposes. ¯_(ツ)_/¯ https://www.wired.co.uk/article/what-is-article-13-article-11-european-directive-on-copyright-explained-meme-ban https://copyrightandtechnology.com/2018/09/13/eu-parliament-approves-watered-down-copyright-directive/

  13. Several start-ups are playing with the idea of using blockchain technology to counter some of the problems inherently tied to E-book DRM. An interesting read: https://www.forbes.com/sites/billrosenblatt/2018/08/18/can-blockchains-disrupt-the-E-book-market-two-startups-will-find-out/#5d7a84435a0b. Another solution is watermarking, which allows the user more ownership over the E-book. https://copyrightandtechnology.com/2016/10/05/E-book-retail-platform-offers-choice-of-watermarking-or-drm/ Speaking of blockchains and DRM, Sony actually filed a patent for this in 2018 as well. https://www.ccn.com/sony-files-for-blockchain-fueled-drm-patent

  14. This absurd 2013 article even goes so far as to suggest that the technological prowess needed to circumvent DRM should be interpreted as a form of cyber crime and thus fall under criminal law, rather than be seen as a potential civil offense.

  15. Although some member states distanced themselves from implementing any specific measures, such as Poland and Portugal https://www.communia-association.org/2017/10/11/european-parliament-talking-drm-right-now/

  16. Although to be fair, there is quite some leeway in Japanese Copyright Law when it comes to the industry of derivative works (二次創作 nijisosaku or 同人誌 dojinshi). 

  17. Recently proposed and highly controversial changes to Japanese copyright law would extend that scope to any copyrighted material downloaded without permission, in an attempt specifically meant to counter piracy of Japanese comics. As is, the implications for regular Internet users could however be quite severe, with little legal ground to stand on, and the changes are often compared to the EU’s implementation of Draft Article 11 and 13. Prime Minister Abe has decided to postpone the bill for now. https://japantoday.com/category/crime/digital-dilemma-japan-flirts-with-overly-aggressive-online-copyright-law

  18. English translations of both are available respectively at http://www.cric.or.jp/english/clj/cl1.html and http://www.japaneselawtranslation.go.jp/law/detail_main?id=83&vm=2&re=

  19. Several years later this is still my go-to method. While the number of new definitions I actually add to my Anki decks has of course drastically decreased over time, the learning process never really stops and I still encounter new expressions or technical jargon on a daily basis, especially while reading non-fiction texts related to my field of study.

  20. Or alternatively, to a .TXT format for further processing with JNovelFormatter. The end result will be kinder to the eyes. JNovelFormatter is a neat little tool by the developer of Rikaisama9 that converts Japanese literature formatted as .TXT into cleanly parsed HTML files. The layout is fairly customizable, although I think the original settings are easy enough on the eye as-is (I like dark backgrounds when reading for hours at a time; it makes me feel less like I’m gazing straight into a light bulb). End-of-sentence periods are turned into bookmarkable anchors so you won’t lose track of your progress.

  21. Although the Trans-Pacific Partnership (TPP) trade agreement did not take effect after the United States withdrew, discussions on copyright law in the context of the TPP did lead to a new definition of what counts as public domain in Japan. As a result, literature now falls into the public domain 70 years after the death of the author rather than 50. This has led to a massive removal of literature from Aozora.

  22. The practice of accumulating not yet learned vocabulary for creating flashcards. 

  23. From a technical point of view, this is based on one’s IP address. Although methods such as using VPNs or dynamic IP addresses have been popular means of bypassing (spoofing) geo-locks, many streaming services are aggressively blocking access through such means.