Category Archives: Tech Insights for Educators

Tech Insights for Educators #3: Intro to managing digital data

Although preservice teachers are often “digital natives,” meaning they grew up with and are familiar with common computing devices, they often are not actually proficient in using them for advanced purposes. In fact, there is a lot more to technology than mere social networking websites and apps. Educators often have many types of files and data to manage and share with their students and colleagues. It is important to establish productive practices that minimize wasting time and loss of data.

Digital data comes to us in many forms that are actually rather “siloed.” For example, typically we do not have a way to reliably search all our emails, text messages, and other digital data sources all at once. Further, while ideally, software would automatically keep a log of all edits made to a file (“versioning”), including a versatile user interface to navigate through prior edits if needed, in fact this is still uncommon and if available, often impractical to use. For example, the “Track Changes” function in Microsoft Office is satisfactory for sharing edits with colleagues, but you still have to create multiple copies of a file to keep track of prior edits that have been “accepted.”

The typical teacher or professor is working many hours per week, frequently in an office setting with Windows PCs provided by their institution. In fact, they often are not even able to install software or fully configure the software available to them. For this reason and others, for practicality it is necessary to manage digital data using commonly available practices and tools. Therefore, we are often restricted to file-level management and web tools.

[Figure: Example List of Folders]

Organizing and naming one’s files and folders is still important to finding and maintaining data, in part because of Windows’ abysmal search functionality. Depicted right is a selection of folders for classes I have taken or been a teaching assistant for at University of Central Florida. Here, we see I have made many choices, with various pros and cons. I have chosen to organize my folders by course code, in a flat structure that includes many different semesters, thus avoiding navigating through many levels of subfolders. I have included other folders not directly related to classes, and have forced them to be displayed at the top of the list in Windows Explorer by prepending the folder names with symbols (here, “%”). For readability, I have used both capital and lowercase letters, and have included the year and semester after the course codes. I use hyphens instead of spaces because spaces in folder and file names can be problematic when data is directly uploaded to a web server. For classes I am currently enrolled in (not shown here), I prepend the folder names with “!” so they appear at the top of the list, and then remove the “!” at the end of the semester. Note that there is a tradeoff here: renaming a folder breaks shortcuts or links and forces many “dumb” applications (e.g., Google Drive, SyncBack) to delete and re-copy the entire folder.

[Figure: Example List of Subfolders]

Here, we have one of my course subfolders (EME 6646, pictured right), which contains additional subfolders to organize files related to this class. The upper six folders consist of files provided by the professor, while the module folders contain more folders and files relating to each module’s materials and assignments. The “Outdated” folder contains an older version of the syllabus subsequently replaced by the professor, and the “PRIVATE” folder contains peer evaluation assignments I did not want to share with teammates. This course had a lot of group work, so I actually shared my files for this course (minus the private folder) with teammates via Google Drive.

I use a Windows application called SyncBackFree to back up data. Separately, the Google Drive desktop application allows you to automatically synchronize folders with your Google Drive, meaning that changes that are made on one will be synchronized on the other. For example, I could save a PDF file from my web browser on my mobile phone to a Google Drive folder and it would automatically be downloaded to my PC, or vice-versa. Google Drive also allows sharing folders with friends who can also be co-editors. I used these features along with a backup profile in SyncBackFree to duplicate changes on my flash drive (“X:”) to the Google Drive EME 6646 folder (excluding the private folder). SyncBackFree offers “synchronization” as well, but I stick to backup, meaning that I am careful to only edit files in one place rather than editing in two places and having to reconcile the differences. (In this case, my peers were not adding to the Google Drive on their own; if they were, a straight backup could overwrite their files if the option to delete data present in the “destination” but not the “source” was checked.)
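To make the difference between backup and synchronization concrete, here is a minimal Python sketch of a one-way backup. It is not how SyncBackFree works internally; the function name `backup_one_way` and the `exclude` parameter are my own assumptions for illustration. It copies new or changed files from the source to the destination, skips an excluded folder, and never deletes anything at the destination.

```python
import shutil
from pathlib import Path


def backup_one_way(source, destination, exclude=("PRIVATE",)):
    """Copy new or changed files from source to destination, never deleting.

    Hypothetical helper for illustration; `exclude` lists folder or file
    names that should not be copied.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for src_file in sorted(source.rglob("*")):
        relative = src_file.relative_to(source)
        # Skip directories and anything inside an excluded folder.
        if not src_file.is_file() or any(part in exclude for part in relative.parts):
            continue
        dst_file = destination / relative
        # Copy only when the destination copy is missing or older.
        if not dst_file.exists() or src_file.stat().st_mtime > dst_file.stat().st_mtime:
            dst_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dst_file)  # copy2 also preserves the modified time
            copied.append(relative)
    return copied


# Hypothetical usage (paths are made up):
# backup_one_way("X:/EME-6646", "C:/Users/me/Google Drive/EME-6646")
```

Because nothing is ever deleted at the destination, a file a teammate added there survives. The cost is that files I delete from the flash drive linger in the backup until I remove them by hand.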

In the EME 6646 course, moreover, we used Google Docs to collaboratively edit our group assignments, and then at the end, I would manually copy-and-paste into a Microsoft Word file and correct formatting prior to submitting the group assignment. Of course this is inefficient, but the professor would not accept Google Docs submissions, so it was more efficient than sending many different Word files back and forth.

One might ask why I did not just edit files directly in my PC’s Google Drive folder. Well, what if Google Drive does something weird, or my colleagues accidentally delete or damage a key file? Then, my Google Drive would automatically synchronize the undesirable changes, resulting in data loss. Backing up your data regularly is important to avoid data loss. However, when synchronization occurs in the background, without user input or reversibility, your data can easily be wiped out by an undesirable change in one place being automatically propagated to another place. This is why RAID-1 disk mirroring is not backup, although it protects against failure of a hard disk drive.

Going back to educators having office PCs at work that don’t allow them to install software, this is a reason I like working from home. I work on my class files, even when at home, directly from a flash drive, while frequently running a SyncBackFree profile to back up the flash drive to one of my PC’s hard disk drives. Then, I can work directly from the flash drive in the field, assured of having the latest copy of my files. You could also use Google Drive via the web interface, but this has disadvantages too (e.g., must log in to Google first, cannot save files directly to Google Drive, Internet problems, limited to 15 GB for free, latency). Another piece of the puzzle that I added a year ago is Backblaze, which securely backs up all files on my home PC (except it will not back up flash drives) for $95 per two years (or, you can pay $5 per month). Although Backblaze will not back up my flash drive directly, because I am using SyncBackFree to back up the flash drive to a hard disk drive, all my data still gets backed up to the cloud. This is literally multiple terabytes because I am also a photographer who shoots in RAW mode, even on my Samsung Galaxy S7 smartphone (Backblaze offers unlimited backup, even though they lose money on customers like me).

Digital asset management is the buzzword particularly for managing many thousands of photos or other non-text data, but such software is too complex and unwieldy for educators’ purposes. Basically, working directly with files and folders is your main option for types of data such as PDFs, Word, Excel, et cetera. For note-taking, I use Evernote (with reservations), which is great for writing, searching, and sorting through text or even PDFs, images, and receipts if you pay for the premium version. I would not recommend taking notes in text files, Microsoft Word files, or Gmail drafts because they are harder to search, manage, and/or synchronize. I have come very much to like taking notes in Evernote on my phone with a Bluetooth keyboard to type on.

One of my big shortcomings is my complete lack of familiarity with Apple products. Some of the issues I talk about here may be Windows-specific and the situation may be much better on Mac… but nevertheless, you are much more likely to be compelled to work on a PC than a Mac. Although it is dated and no longer updated, I use Locate32 to search through files on my PC, which is much faster and better than Windows or Cortana search. It is especially useful if you give descriptive filenames, which I recommend, although it takes practice to make filenames short, consistent, useful, and quick to devise.

[Figure: Example List of Files]

When naming files and dealing with different versions, a simple and effective approach is to make a copy for each editing session and then save it with the date and time (“timestamp”) in the filename, as pictured left. If you are working in Adobe Photoshop, you might use non-destructive editing techniques such as layers or parametric editing instead. However, Microsoft PowerPoint and many other applications cater to destructive editing (where data is deleted or overwritten during the editing process), so keeping multiple files with timestamps is the simplest compromise. Then, you can revert or borrow elements from a prior version if needed. Although this takes up additional storage space and duplicates material, you could always archive the files in a 7-zip archive afterward. For example, the pictured files are 3.5 MB but only 1.5 MB in a .7z archive.
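If you would rather not type timestamps by hand, the copy-per-session habit can be sketched in a few lines of Python. The function names `timestamped_name` and `save_version` are hypothetical, and the sketch assumes the timestamp goes at the end of the filename, just before the extension.

```python
import shutil
from datetime import datetime
from pathlib import Path


def timestamped_name(path, when):
    """Return the path with a YYYYMMDD-HHMM stamp inserted before the extension."""
    path = Path(path)
    stamp = when.strftime("%Y%m%d-%H%M")
    return path.with_name(f"{path.stem}-{stamp}{path.suffix}")


def save_version(path, when=None):
    """Copy the file to a timestamped sibling and return the new path."""
    target = timestamped_name(path, when or datetime.now())
    shutil.copy2(path, target)
    return target


# A made-up filename, stamped for 2:05 PM on August 21, 2017:
print(timestamped_name("Lecture-3.pptx", datetime(2017, 8, 21, 14, 5)))
# Lecture-3-20170821-1405.pptx
```

Calling `save_version` at the start of each editing session leaves the previous state untouched on disk, which is the whole point of the approach.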

Note in this example that I have used YYYYMMDD-HHMM format for the timestamps. If you use MM-DD-YY format, all the files from the same month are clustered together, even if they are from different years! This is why formatting the timestamp in order of declining significance (i.e., year, month, day, hour, minute) is preferable. Then, files will be sorted chronologically. Further, I recommend using and becoming familiar with the 24-hour clock, because using a 12-hour clock would end up grouping files from 5 AM with files from 5 PM. You could use YYYY-MM-DD-HH-MM format for easier readability while maintaining chronological sort, albeit with slightly longer filenames. You could also use two-digit years, but I prefer all four digits because it makes it clearer that you are looking at a year, and will not be confusing in future centuries.
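The sorting argument is easy to check for yourself. The short Python sketch below uses four made-up moments and tests whether sorting the filenames alphabetically (which is what a file listing does) also sorts them chronologically. The helper name `sorts_chronologically` is my own.

```python
from datetime import datetime

# Four made-up editing sessions, deliberately out of order.
moments = [
    datetime(2016, 12, 31, 17, 0),
    datetime(2017, 1, 5, 11, 0),
    datetime(2017, 1, 5, 13, 0),
    datetime(2016, 1, 6, 9, 0),
]


def sorts_chronologically(fmt):
    """True if sorting the formatted names as text matches sorting by time."""
    names = [moment.strftime(fmt) for moment in moments]
    by_name = [moment for _, moment in sorted(zip(names, moments))]
    return by_name == sorted(moments)


results = {
    "YYYYMMDD-HHMM": sorts_chronologically("%Y%m%d-%H%M"),
    "YYYY-MM-DD-HH-MM": sorts_chronologically("%Y-%m-%d-%H-%M"),
    "MM-DD-YY": sorts_chronologically("%m-%d-%y"),
    "YYYYMMDD with 12-hour clock": sorts_chronologically("%Y%m%d-%I%M"),
}
for label, ok in results.items():
    print(label, "->", "chronological" if ok else "NOT chronological")
```

Only the two formats that run from most significant to least significant, with a 24-hour clock, keep the files in order.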

Here, I have put the timestamps toward the end of the filename. An alternate approach, more useful in some situations, is to put the timestamp right at the beginning. Then, all files in a folder will be sorted by timestamp. In other situations, you may want files to first be grouped by some other keyword(s), so the timestamp should be placed later in the filenames. Note that you can also use the “Date Modified” column in Windows to sort the files by when they were last edited. However, this option is not available in many settings (e.g., the Google Drive web interface).

While there are many other pieces to effectively managing your digital data (e.g., a ScanSnap ix500 scanner, LastPass or Dashlane, VeraCrypt or BitLocker, etc.), organizing and backing up individual files is a principal component and important starting point. Putting the time in to learn about this and establish good methods now will pay great dividends in the long run.

Tech Insights for Educators #2: The nature of digital data

What is digital data? Mainly, data that is represented discretely—that is, in steps—rather than continuously. For example, while a mercury thermometer can represent infinitesimally small variations in temperature, a digital thermometer would be limited to displaying a specific value. While a more expensive, more accurate thermometer might be accurate to several decimal places (e.g., 62.341 degrees), it still cannot be as continuous as the analog equivalent.

When a continuous, analog equivalent is available, why would we want to limit ourselves by representing a phenomenon digitally? There are actually many reasons! Digital data can be more compact, transmittable, faithfully reproducible, duplicable, and losslessly manipulable. For instance, a set of photographic prints takes up a lot of physical space, cannot be transmitted, cannot be reproduced without loss of fidelity, is not easily duplicated, and cannot be manipulated without loss of data. A set of digital photographic files can be stored in as small a space as a microSD memory card (the size of a fingernail), can be easily transmitted over a data bus or the Internet, can be reproduced with high fidelity, can be easily duplicated via digital copying, and can be manipulated easily and without data loss (e.g., by making a copy).

In common parlance, a digit can be any of 10 values: 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. However, when we talk about digital data, we are almost always talking about binary data. A bit, or binary digit, can only take on two values, represented as (0) zero (“off”) and (1) one (“on”). Bits are the building blocks of all modern computing. Even something as complex as a high-definition motion picture or immersive, interactive video game can be represented, stored, and processed as billions of bits.

Since approximately 1993, bits have been organized into groups of eight, called bytes (before, a “byte” might have had a different number of bits, but now it is universally eight). When you see storage capacity listed for a USB flash drive, optical disc, hard or solid-state disk drive, smartphone, et cetera, it is listed in bytes. Because a byte has eight bits, it can take on one of 256 values. That is, the number of potential combinations for a byte is 2^8, which is 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 = 256, encompassing 00000000, 00000001, 00000010, … all the way to … 11111111.
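As a quick check of the arithmetic above, this small Python sketch counts the combinations and lists every bit pattern a byte can take.

```python
BITS_PER_BYTE = 8

# Each bit doubles the number of combinations: 2 x 2 x ... x 2, eight times.
combinations = 2 ** BITS_PER_BYTE

# Write every value as an eight-character string of zeros and ones.
patterns = [format(value, "08b") for value in range(combinations)]

print(combinations)                            # 256
print(patterns[0], patterns[1], patterns[2])   # 00000000 00000001 00000010
print(patterns[-1])                            # 11111111
```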

This means a byte is enough space to store a typical character of text—for instance, the word “byte” can be represented by four bytes—one for each letter. You would need 26 different combinations to store all letters of the alphabet. If you add case sensitivity, you have to double this to 52 to be able to represent both “a” and “A,” “b” and “B,” et cetera. If we add digits 0–9, we now need 62 combinations. With 256 combinations, this leaves plenty of combinations for common symbols and punctuation. While there are characters requiring more than one byte to represent, because 256 combinations isn’t enough when you consider the vast range of symbols, typographical marks, or diacritical marks, eight bits is enough to represent most English text. In more complex text-editing environments (e.g., Microsoft Word), additional bytes are employed to represent other attributes such as font type, font size, and text style (e.g., bold, italics, underline).
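You can see this for yourself in Python. The sketch below assumes two specific encodings that the discussion above does not name: ASCII for plain English letters, and UTF-8 as one common scheme in which less common characters take more than one byte.

```python
word = "byte"

# In ASCII, each English letter occupies exactly one byte.
ascii_bytes = word.encode("ascii")
print(len(ascii_bytes))    # 4
print(list(ascii_bytes))   # [98, 121, 116, 101]

# In UTF-8 (an assumption here), accented letters and symbols need more bytes.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in ("A", "\u00e9", "\u20ac")}
for character, length in utf8_lengths.items():
    print(character, "takes", length, "byte(s)")
```

Here "\u00e9" is an e with an acute accent and "\u20ac" is the euro sign, written as escape codes so the example does not depend on how this page itself is encoded.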

If you have an essay of 5000 words in a simple text-editing environment, with an average word length of five characters, and, if we are generous and add two characters per word for spaces, line breaks, and punctuation, this gives 5000 × 7 = 35,000 bytes, or 280,000 bits. The 1981 Hayes Smartmodem could transmit 300 bits per second, so in 1981, our essay would take about 280,000 / 300 = 933 seconds to transmit (that is, just under 16 minutes). At the end of the dial-up era, transmission speeds in the United States improved to about 53,000 bits per second, which means our essay could be transmitted in just over five seconds. Modern Internet connections are asymmetric, meaning they download (receive) data faster than they can transmit (“send,” “upload”) data. As of 2013, the average United States Internet user can download 8,700,000 bits per second, and perhaps transmit 1,000,000 bits per second. Therefore, our 5000-word, 280,000-bit essay can now be transmitted in only 0.28 seconds! If we add time for network latency, which is basically limited by the speed of light, we can still transmit our essay in under a second, typically. This is simply impossible if the essay was represented in text on physical paper.
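The same arithmetic can be written out as a short Python sketch, which makes it easy to plug in your own essay length or connection speed. The speeds are the ones quoted above; the function name `seconds_to_send` is my own, and latency is ignored.

```python
WORDS = 5000
CHARACTERS_PER_WORD = 5 + 2   # five letters plus two for spaces and punctuation
BITS_PER_BYTE = 8

essay_bytes = WORDS * CHARACTERS_PER_WORD   # one byte per character
essay_bits = essay_bytes * BITS_PER_BYTE


def seconds_to_send(bits, bits_per_second):
    """Transmission time, ignoring latency and protocol overhead."""
    return bits / bits_per_second


print(essay_bytes, essay_bits)                          # 35000 280000
print(round(seconds_to_send(essay_bits, 300)))          # 933 (1981 modem)
print(round(seconds_to_send(essay_bits, 53_000), 1))    # 5.3 (late dial-up)
print(seconds_to_send(essay_bits, 1_000_000))           # 0.28 (1 Mb/sec upload)
```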

When talking about digital data, because we deal with such large numbers, it is necessary to introduce metric prefixes for ease of discussion and comprehension. That is, we talk about bytes and bits with prefixes that multiply them by factors of a thousand (“kilo”—kilobyte, kilobit), a million (“mega”—megabyte, megabit), a billion (“giga”—gigabyte, gigabit), or a trillion (“tera”—terabyte, terabit). Therefore, a megabit, commonly written as Mb or Mbit, is 1,000,000 bits (125,000 bytes). A megabyte, commonly written as MB, is 1,000,000 bytes (8,000,000 bits). Note that the lowercase “b” indicates a bit, while an uppercase “B” indicates a byte, which is eight bits.

Typically, network transmission speeds are discussed in bits, while storage capacity is discussed in bytes. A common Internet connection speed is asymmetric, with 10 Mb/sec downstream and 1 Mb/sec upstream, meaning that 10 Mb (1.25 MB) of data can be downloaded (received) per second, and 1 Mb (125 KB) of data can be uploaded (transmitted) per second. The Samsung Galaxy S8 smartphone comes with 64 GB of internal nonvolatile storage, meaning that it can store 64 billion bytes (512 billion bits). The latest microSD memory cards can reliably store 256 GB in an area smaller than a thumbnail, which is 2.048 trillion bits (2.048 Tb)!
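Because mixing up bits and bytes is such a common mistake, here is a small Python sketch of the conversions used in the examples above. The helper names are my own, and the prefixes are the decimal (factor-of-a-thousand) ones described earlier.

```python
BITS_PER_BYTE = 8
KILO, MEGA, GIGA, TERA = 10**3, 10**6, 10**9, 10**12


def bits_to_bytes(bits):
    return bits / BITS_PER_BYTE


def bytes_to_bits(byte_count):
    return byte_count * BITS_PER_BYTE


# A 10 Mb/sec download speed, expressed in megabytes per second.
print(bits_to_bytes(10 * MEGA) / MEGA)     # 1.25
# A 1 Mb/sec upload speed, expressed in kilobytes per second.
print(bits_to_bytes(1 * MEGA) / KILO)      # 125.0
# 64 GB of storage, expressed in billions of bits.
print(bytes_to_bits(64 * GIGA) / GIGA)     # 512.0
# A 256 GB memory card, expressed in terabits.
print(bytes_to_bits(256 * GIGA) / TERA)    # 2.048
```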

Digital data can also be compressed. For example, our 35 KB essay has patterns in it which can be stored more succinctly. Doing so requires more computing power to encode and decode, but might reduce the amount of space needed to represent the essay to 10 KB. When dealing with text, this would be a lossless operation, meaning the compression results in no loss of fidelity when reversed (expanded or “decoded”). For example, the HTML, or hypertext markup language that is the foundation of this webpage, is losslessly compressed using “gzip” before being transmitted to, and subsequently decoded by, your web browser.
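Python includes a `gzip` module, so lossless compression is easy to try. In this sketch the "essay" is just one made-up sentence repeated hundreds of times, which is an assumption that makes it compress far better than real prose would. The point is only that decompressing gives back exactly the original bytes.

```python
import gzip

# A stand-in essay: one sentence repeated (real prose is less repetitive).
sentence = "Educators often have many types of files and data to manage. "
essay = (sentence * 575).encode("ascii")

compressed = gzip.compress(essay)
restored = gzip.decompress(compressed)

print(len(essay), "bytes before compression")
print(len(compressed), "bytes after compression")
print("identical after decoding:", restored == essay)
```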

When we represent complex data such as audio, still photographs, and videos digitally, compression is vital, almost universal, and more commonly lossy, meaning that data in unimportant areas is permanently discarded to save storage space. If you remember the old days of audio compact-discs (CDs), they could only store 74 or 80 minutes of audio because they weren’t compressed. However, through a lossy compression mechanism known as MP3, you could store 10 hours of music on a CD! Similarly, JPEG is the most common method of lossily compressing digital photographs, and H.264 is a leading way to lossily compress digital audiovisual materials. While lossless compression formats exist for audio, images, and video, particularly with video, the space requirements are tremendous, which is why lossy compression algorithms are used to simplify and discard data in areas likely to be unimportant. For example, in a photograph with dark areas, JPEG encoding discards data in the dark areas because you are unlikely to see it. But, if you were to brighten the image, this data loss would become abundantly apparent! (Pictured right in example below—photograph by Richard Thripp.)

JPEG artifacts in shadows pictured left

Most lossy compression algorithms, and even lossless compression algorithms, let you specify the degree of compression. If you want to save more space, you can choose to do so. However, with lossy algorithms, you will lose fidelity, and with lossless algorithms, although no fidelity will be lost, more computational power will be required to compress and decompress the data.
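For a lossless example of this dial, Python's `zlib` module accepts a compression level from 0 (store the data without compressing it) to 9 (try hardest). The sketch below uses made-up, repetitive text, so the exact sizes will differ from what you would see with a real document.

```python
import zlib

text = "Digital data can be compressed, and compression can be tuned. "
data = (text * 500).encode("ascii")

# Level 0 stores the data as-is; higher levels spend more effort looking for patterns.
sizes = {level: len(zlib.compress(data, level)) for level in (0, 1, 6, 9)}

for level, size in sizes.items():
    print("level", level, "->", size, "bytes")

# Whatever the level, decoding returns exactly the original bytes.
all_lossless = all(
    zlib.decompress(zlib.compress(data, level)) == data for level in sizes
)
print("lossless at every level:", all_lossless)
```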

Humans cannot actually listen to audio nor view a photograph or video in binary format. When you view a digital image, you are actually seeing an analog representation of that image. Mainly, this means it could look different depending on the device or medium of presentation. For example, a digital image displayed on a computer monitor may look different than when displayed on a smartphone, or printed on paper. However, the digital data itself remains the same and can be duplicated without loss of data. In the old days, we would have an analog “master” copy of an audio recording, still image, or video that would be duplicated with loss of fidelity. Then, when that master copy wore out from being frequently duplicated, we might be limited to duplicating a copy of the master copy, and eventually a copy of a copy of a copy, with declining quality each time. For example, security cameras often used to use analog tape that would be recorded and re-recorded ad nauseam, causing the tape to degrade. If the tape was not replaced regularly, shoplifters might appear on the tape as a useless, fuzzy blob. Digital recording largely eliminates this type of problem. (Although repeatedly subjecting digital data to a lossy encoding algorithm produces similar effects, the master copy itself does not degrade by being accessed or duplicated—unless you erase it!)

Digital data, particularly when compressed, is more fragile than analog data. For example, if the signal was bad, analog television transmissions often had noise or “snow,” but could still be watched. However, digital television transmissions stutter or are completely unwatchable if the signal is bad.

Intuitively, it makes sense that uncompressed digital data is more resilient than compressed digital data, meaning that we could lose part of the data and still be able to view the rest of it. For example, if we lost part of our 35 KB essay file, we could still read the rest of it. However, if we compress it to 10 KB, the compression algorithm might require all of those 10 kilobytes to be present to produce readable output. In fact, the more powerful the compression, the more likely that every bit is required to produce any usable output, because of how efficiently and intricately the data is compressed. Moreover, if we lose or forget the algorithm to decompress the data, we are lost! Nevertheless, compression is necessary, valuable, and relatively safe if we stick with popular and mainstream formats.
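This fragility can be demonstrated with a small Python sketch. It cuts a made-up essay in half twice: once as plain text, and once after compressing it with `zlib`. The plain half is still readable, while the simple one-shot `zlib.decompress` call refuses the truncated compressed data. (Specialized recovery tools can sometimes salvage the beginning of a damaged stream, so treat this as an illustration rather than a rule.)

```python
import zlib

sentence = "It is important to establish productive practices. "
essay = (sentence * 700).encode("ascii")
compressed = zlib.compress(essay)

# Lose the second half of each version.
half_plain = essay[: len(essay) // 2]
half_compressed = compressed[: len(compressed) // 2]

# The uncompressed half is still perfectly readable text.
readable_text = half_plain.decode("ascii")
print(readable_text[:51])

# The compressed half cannot be decoded by a simple decompress call.
try:
    zlib.decompress(half_compressed)
    recovered = True
except zlib.error as problem:
    recovered = False
    print("could not decompress:", problem)
```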

Although a byte has eight bits, it can be more useful to represent it as a number using all 10 digits, or as a “hexadecimal” code. While you would think a base-10 representation would be numbered 1–256, in fact, counting from (0) zero is the prevailing practice, so we would represent the binary byte 00000000 as 0, 00000001 as 1, 00000010 as 2, 10000000 as 128, 11110000 as 240, and 11111111 as 255. In contrast to base-10, hexadecimal extends base-10 to base-16, giving us 16 combinations to work with in one character instead of 10. While in base-10, 9 is the 10th and final character, hexadecimal extends this by making A the 11th character, B the 12th character, C the 13th character, D the 14th character, E the 15th character, and F the 16th character. Therefore, 0 (00000000) is 00 and 255 (11111111) is FF in hexadecimal.
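Python can convert between these three notations directly, which is a handy way to check the examples above.

```python
examples = ["00000000", "00000001", "00000010", "10000000", "11110000", "11111111"]

rows = []
for bits in examples:
    value = int(bits, 2)                # read the string as a base-2 number
    hex_code = format(value, "02X")     # two hexadecimal characters, uppercase
    rows.append((bits, value, hex_code))
    print(bits, "=", value, "=", hex_code)
```

Each group of four bits corresponds to exactly one hexadecimal character, which is why a byte always fits in two of them.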

It is very common to represent colors in hexadecimal, three-byte R–G–B format. Here, 16,777,216 colors (2^24) can be represented hexadecimally with only six characters, representing 24 bits. R, G, and B stand for red, green, and blue (the three additive primary colors), with higher values indicating brighter colors. In a six-character hexadecimal color code, Characters 1–2 represent red, Characters 3–4 represent green, and Characters 5–6 represent blue. FF is the highest intensity, while 00 is the lowest intensity. Thus, pure red would be FF0000, pure green would be 00FF00, pure blue would be 0000FF, pure white would be FFFFFF, and pure black would be 000000.
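Splitting a color code into its three bytes takes only a few lines of Python. The function name `hex_to_rgb` is my own choice for this sketch.

```python
def hex_to_rgb(code):
    """Split a six-character hexadecimal color into (red, green, blue) values 0-255."""
    return tuple(int(code[start:start + 2], 16) for start in (0, 2, 4))


for code in ("FF0000", "00FF00", "0000FF", "FFFFFF", "000000"):
    print(code, hex_to_rgb(code))

# Three bytes per color gives 256 x 256 x 256 possible colors.
total_colors = 256 ** 3
print(total_colors)   # 16777216
```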

Twenty-four bits per pixel is considered a “true color” image. However, if we were to store a photograph from a 15-megapixel (MP) digital camera in true color without compression, we would need three bytes per pixel, or 45 MB! JPEG compression is essential for reducing this to a more manageable filesize of approximately 2–5 MB.
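Spelled out as a rough Python calculation, using decimal megabytes and the 2 to 5 MB JPEG range mentioned above:

```python
MEGAPIXELS = 15
BYTES_PER_PIXEL = 3   # one byte each for red, green, and blue

uncompressed_bytes = MEGAPIXELS * 1_000_000 * BYTES_PER_PIXEL
uncompressed_mb = uncompressed_bytes / 1_000_000
print(uncompressed_mb)   # 45.0

# How many times smaller the JPEG is at each end of the 2-5 MB range.
ratio_small_jpeg = uncompressed_mb / 2
ratio_large_jpeg = uncompressed_mb / 5
print(ratio_large_jpeg, "to", ratio_small_jpeg, "times smaller")
```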

While this was by no means an exhaustive discussion of digital data and focused primarily on capacity, representation, and compression rather than other concerns such as storage, volatility, latency, transmission, processing, and encryption, nonetheless, you should now have a grasp of the fundamental underpinnings of the digital world.

Tech Insights for Educators #1: Special typographic characters and alt codes

This is the first in a new series of Technology Insights for Educators which I will use as supplemental materials for my students in EME 2040: Introduction to Technology for Educators at University of Central Florida, which may also be of general interest. As I enter my second year of the Education Ph.D., Instructional Design and Technology program, I am becoming a Graduate Teaching Associate and will be teaching two mixed mode sections of EME 2040 (Monday 10:30 A.M. – 1:30 P.M. and Wednesday 1:30 – 4:20 P.M.) as Instructor of Record in Fall 2017. At a later time, I will make a landing or index page for these insights.


When preparing documents, et cetera, there are many typographic characters that are not available on a standard keyboard, and yet are supported by Unicode and can be used in most applications (e.g., Microsoft Office).

On Microsoft Windows, if a numeric keypad is available (found on the right side of the keyboard), such characters can be directly typed with alt codes. With the Num Lock key enabled, one should depress one of the Alt keys, and while doing so, type a sequence of numbers on the numeric keypad, and then release Alt. Then, the special character will appear. I found a list of many alt codes in this blog post by “Techno World 007.” Here are some of the most important ones:

Symbol   Alt Code      Description
•        Alt + 0149    Bullet point
–        Alt + 0150    En dash
—        Alt + 0151    Em dash
¢        Alt + 0162    Cent sign
°        Alt + 0176    Degree symbol
×        Alt + 0215    Multiplication sign
÷        Alt + 0247    Division sign
′ *      Alt + 8242    Prime symbol
″ *      Alt + 8243    Double prime symbol

* Alt code works in Microsoft Office, but not most other programs.
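If you are curious where these numbers come from, the following Python sketch reproduces the table. It rests on two assumptions of mine that are not stated above: that the codes with a leading zero are byte values in the Windows-1252 character set (the usual case on Western-language Windows systems), and that the larger codes are decimal Unicode code points. The function name `alt_code_to_char` is hypothetical.

```python
def alt_code_to_char(code):
    """Map an alt code such as "0151" or "8242" to the character it produces."""
    number = int(code)
    if code.startswith("0") and number <= 255:
        # Assumption: leading-zero codes are Windows-1252 byte values.
        return bytes([number]).decode("cp1252")
    # Assumption: other codes are decimal Unicode code points.
    return chr(number)


table = {
    code: alt_code_to_char(code)
    for code in ("0149", "0150", "0151", "0162", "0176", "0215", "0247", "8242", "8243")
}
for code, symbol in table.items():
    print("Alt +", code, "->", symbol)
```

A handful of byte values have no character assigned in Windows-1252, so this sketch would raise an error for those codes. None of the codes in the table are affected.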

If a numeric keypad is unavailable (e.g., on a laptop), or you are in a non-Windows environment, there are other options. In Microsoft Word, there is the “symbol” section. Another option is simply copying-and-pasting the symbol into the target document. In Microsoft Word, this should be done with the “keep text only” paste option to prevent inheriting conflicting font size or formatting from the source.

What you see in many academic manuscripts, books, and other materials is frequently incorrect. Using a hyphen between a number range (e.g., 10-99) is not correct—an en dash should be used (e.g., 10–99). When an author speaks of a two-by-two interaction, calling it a 2*2 or 2x2 is typographically incorrect. Instead, the multiplication sign should be used (i.e., 2×2). When talking about height or distance, one should use the prime and double-prime symbols, rather than the single and double-quote symbols, respectively (e.g., not 5’10”, but rather, 5′10″).
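Two of these corrections are mechanical enough to automate. The Python sketch below is a rough illustration with a hypothetical function name, `fix_typography`. It deliberately ignores cases where a hyphen or an "x" between digits is legitimate (dates such as 2017-08-21, phone numbers, or hexadecimal codes such as 0x1F), so review the output rather than trusting it blindly.

```python
import re

EN_DASH = "\u2013"
MULTIPLICATION_SIGN = "\u00d7"


def fix_typography(text):
    """Replace digit-hyphen-digit with an en dash and digit x/* digit with a times sign."""
    # A hyphen directly between two digits becomes an en dash (e.g., 10-99).
    text = re.sub(r"(?<=\d)-(?=\d)", EN_DASH, text)
    # An "x" or "*" between two digits, with optional spaces, becomes a times sign.
    text = re.sub(r"(?<=\d)\s*[x*]\s*(?=\d)", MULTIPLICATION_SIGN, text)
    return text


print(fix_typography("Scores of 10-99 in a 2x2 design"))
print(fix_typography("a 2*2 or 2 x 2 interaction"))
```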

In some cases, Microsoft Word will help you. For example, if you type two hyphens between words, it automatically converts the two hyphens to an em dash (—).

Personally, I am so used to using some of the symbols that I have memorized the alt codes for an en dash, an em dash, the cent sign, and the multiplication sign (–, —, ¢, ×). This way, when I am typing in an online discussion, et cetera, and must employ these symbols to be typographically correct, there is no need for me to copy-and-paste from an external source or consult a character map.

You can impress or annoy your colleagues with your knowledge of typography. Surprisingly, I have found that knowledge of the en dash, in particular, is sparse. Most people, including full professors, incorrectly use hyphens where en dashes are required. I suppose many academic journals correctly employ en dashes only because the editors make corrections to the authors’ manuscripts.