S.-F. Yang's Blog in English (feed updated 2024-02-19)<br />
This is my free-style blog in English.<br />
There is another <a href="http://sfyang.blogspot.com/">Traditional Chinese version</a>.<br />
<br />
URL encoding/decoding with shell script<br />
Shang-Feng Yang, 2012-04-07<br />
<br />
Quite some time ago, I wrote a simple shell script called "<b><i>urldecode</i></b>" which decodes an "escaped", or URL-encoded, string using the "<b>printf</b>" utility from GNU coreutils. Today, however, while writing a shell script to generate a short URL through <a href="http://tinyurl.com">tinyurl.com</a>, I ran into the opposite problem: I needed to URL-encode a string. So, after reading the "<a href="http://en.wikipedia.org/wiki/Percent-encoding">Percent-encoding</a>" page on Wikipedia, I finished my "<b><i>urlencode</i></b>" script.<br />
<br />
<span class="fullpost"><br />
Let me talk about the decoding part first. Decoding a URL-encoded string is relatively simple. Since the "<b>printf</b>" utility accepts the "\xHH" escape in its format string, where "HH" is one or two hexadecimal digits giving the value of a byte, the only pre-processing needed is to replace every '%' character in the string with '\x'. After that, just pass the processed string to "<b>printf</b>" to get the decoded string. The following code is my implementation of this process:<br />
<br />
<blockquote class="code"><br />
#!/bin/bash<br />
#<br />
# urldecode - decoding the URL-encoded string<br />
#<br />
# (C)2010 Shang-Feng Yang <storm_DOT_sfyang_AT_gmail_DOT_com><br />
#<br />
# License: GPLv3<br />
<br />
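# Take the encoded string from the arguments, or from STDIN if none is given,<br />
# and turn every '%' into '\x' so that printf expands %HH as the byte \xHH<br />
ENC_STR="$*"<br />
if [ -z "${ENC_STR}" ]; then<br />
TMP_STR="$(sed -e 's/%/\\x/g')"<br />
else<br />
TMP_STR="$(echo "${ENC_STR}" | sed -e 's/%/\\x/g')"<br />
fi<br />
PRINTF=/usr/bin/printf<br />
exec ${PRINTF} "${TMP_STR}\n"<br />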
</blockquote><br />
The "<b>urldecode</b>" script can read the string from either STDIN or the command-line arguments. It has an obvious shortcoming: since the whole string is passed to the "<b>printf</b>" utility as a single argument (the format string), the operation will fail if the encoded string is longer than the system allows for command-line arguments.<br />
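<br />
One way around that length limit, sketched here under the assumption that the script runs under bash, is to let bash's built-in printf decode the input one line at a time instead of handing everything to the external utility. The function name urldecode_stream is made up for this illustration. As in the script above, a literal backslash in the input would still be treated as an escape.<br />

```shell
#!/bin/bash
# urldecode_stream (hypothetical name): decode percent-encoded text from STDIN,
# one line at a time, using only bash built-ins.
urldecode_stream() {
    local line
    # '|| [ -n "${line}" ]' also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "${line}" ]; do
        # ${line//%/\\x} replaces every '%' with '\x';
        # the %b conversion then expands each \xHH escape into a byte
        printf '%b\n' "${line//%/\\x}"
    done
}
```

It could be used as a filter, for example: echo 'a%20b' | urldecode_stream<br />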
<br />
For the encoding part, things become a little more complicated. At first, I was thinking about finding the reserved characters, escaping them, and then replacing the original characters with the escaped ones. For that purpose, I wrote a short script, called "<b>char2hex</b>", to find the ASCII byte value of a given character:<br />
<br />
<blockquote class="code"><br />
#!/bin/bash<br />
#<br />
# char2hex - returning the hexadecimal value of the given characters<br />
#<br />
# (C)2012 Shang-Feng Yang <storm_DOT_sfyang_AT_gmail_DOT_com><br />
#<br />
# License: GPLv3<br />
<br />
function usage() {<br />
echo -e "Usage:\n"<br />
echo -e "\t$(basename $0) CHARACTER(S)_TO_CONVERT\n"<br />
}<br />
<br />
CHAR=$1<br />
<br />
[ "x${CHAR}" == "x" ] && { usage; exit 1; }<br />
<br />
echo -n "${CHAR}" | od -A n -t x1 | tr -d ' '<br />
</blockquote><br />
This script is quite straightforward. The only thing worth mentioning is the reason for the '-n' option of the "<b>echo</b>" command. By default, "<b>echo</b>" appends a newline character to whatever it prints, so you would get an "additional" "0a" in the output. The '-n' option turns off this behavior.<br />
<br />
This approach seems relatively elegant and simple, but the implementation could be a nightmare. Maybe I'm just not smart enough, but I cannot figure out a simple way to "pick out" the reserved characters from the input string or input stream and pass them to the "<b>char2hex</b>" script using simple shell syntax or simple utilities. Either it would take too much effort, or the script would be quite inefficient due to heavy I/O. That is clearly not an acceptable way to do this kind of thing for a lazy guy like me.<br />
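<br />
For what it's worth, the per-character idea can be sketched without calling char2hex at all, assuming bash rather than plain sh: the built-in printf yields the numeric value of a character when its argument starts with a single quote. The function name urlencode_chars is hypothetical, and the sketch has only been thought through for ASCII input. Note that it encodes every character outside the unreserved set, which is more than a table of reserved characters would cover.<br />

```shell
#!/bin/bash
# urlencode_chars (hypothetical name): percent-encode a string one character
# at a time using only bash built-ins. Assumes ASCII input.
urlencode_chars() {
    local LC_ALL=C            # make the [A-Za-z] ranges plain ASCII
    local s=$1 out='' c hex i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case ${c} in
            # Unreserved characters are copied unchanged
            [A-Za-z0-9._~-]) out+=${c} ;;
            # A leading single quote makes printf use the character's numeric value
            *) printf -v hex '%%%02X' "'${c}"; out+=${hex} ;;
        esac
    done
    printf '%s\n' "${out}"
}
```

For example: urlencode_chars 'a b&c'<br />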
<br />
After reading both the sections "<a href='http://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_reserved_characters'>Percent-encoding reserved characters</a>" and "<a href='http://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_the_percent_character'>Percent-encoding the percent character</a>" of the Wikipedia page "<a href='http://en.wikipedia.org/wiki/Percent-encoding'>Percent-encoding</a>", I found that there are not many reserved characters that need to be encoded, so it is practical to implement the encoding with a lookup table. So, the solution is stupid but simple:<br />
<br />
<blockquote class="code"><br />
#!/bin/bash<br />
#<br />
# urlencode - escaping the reserved characters using URL-encoding<br />
#<br />
# (C)2012 Shang-Feng Yang <storm_DOT_sfyang_AT_gmail_DOT_com><br />
#<br />
# License: GPLv3<br />
<br />
STR="$*"<br />
[ -z "${STR}" ] && STR="$(cat -)"<br />
<br />
# '%' must be handled first; otherwise the '%' signs produced by the<br />
# other rules would themselves be encoded again (e.g. ' ' -> %20 -> %2520)<br />
echo "${STR}" | sed -e 's|%|%25|g' \<br />
-e 's| |%20|g' \<br />
-e 's|!|%21|g' \<br />
-e 's|#|%23|g' \<br />
-e 's|\$|%24|g' \<br />
-e 's|&|%26|g' \<br />
-e "s|'|%27|g" \<br />
-e 's|(|%28|g' \<br />
-e 's|)|%29|g' \<br />
-e 's|*|%2A|g' \<br />
-e 's|+|%2B|g' \<br />
-e 's|,|%2C|g' \<br />
-e 's|/|%2F|g' \<br />
-e 's|:|%3A|g' \<br />
-e 's|;|%3B|g' \<br />
-e 's|=|%3D|g' \<br />
-e 's|?|%3F|g' \<br />
-e 's|@|%40|g' \<br />
-e 's|\[|%5B|g' \<br />
-e 's|]|%5D|g'<br />
</blockquote><br />
The "<b>urlencode</b>" script is simple enough that it needs little explanation. Like "<b>urldecode</b>", it accepts the target string from either STDIN or the command-line arguments. The following demonstrates the usage of the scripts:<br />
<br />
<blockquote class="terminal"><br />
$ urlencode http://en.wikipedia.org/wiki/Percent-encoding<br />
http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FPercent-encoding<br />
$ echo 'http://en.wikipedia.org/wiki/Percent-encoding' |urlencode <br />
http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FPercent-encoding<br />
$ urldecode $(urlencode http://en.wikipedia.org/wiki/Percent-encoding)<br />
http://en.wikipedia.org/wiki/Percent-encoding<br />
$ urlencode http://en.wikipedia.org/wiki/Percent-encoding |urldecode<br />
http://en.wikipedia.org/wiki/Percent-encoding<br />
</blockquote><br />
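<br />
One detail of the lookup-table approach is easy to get wrong: sed applies the -e rules in the order given, so the rule for '%' has to run before any rule that produces a '%'. The two functions below are made-up, two-rule miniatures of the table, just to show the difference.<br />

```shell
#!/bin/bash
# Hypothetical two-rule miniatures of the lookup table

# Wrong order: the '%' written by the space rule is encoded a second time
encode_space_first() {
    sed -e 's| |%20|g' -e 's|%|%25|g'
}

# Right order: literal '%' characters are encoded before any new '%' appears
encode_percent_first() {
    sed -e 's|%|%25|g' -e 's| |%20|g'
}

echo '100% sure' | encode_space_first     # prints 100%25%2520sure
echo '100% sure' | encode_percent_first   # prints 100%25%20sure
```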
PS. Because I "upgraded" the old template to the new one, there are some formatting errors in the code and terminal blocks... I will <i>probably</i> fix them by modifying the underlying CSS of the new template in the future, if I get enough motivation...<br />
</span><br />
<br />
"Speaking Mandarin Chinese" in Hollywood<br />
Shang-Feng Yang, 2011-11-08<br />
<br />
It is quite common to see scenes in TV shows or movies in which the characters speak something they claim to be "Mandarin Chinese". Some characters even claim to be very fluent in it. However, to a <b>native Traditional Chinese speaker from Taiwan</b> like me, most of the time the so-called Chinese on screen is barely understandable, if it can be understood at all.<br />
<br />
It is quite strange, since there should be lots of native Chinese speakers near the production locations of these shows and movies. Is it that hard to find a decent language consultant to ensure the proper pronunciation of a few lines? Or is Hollywood just too proud to admit that it cannot get this right, while being so self-centered and so used to laughing at those who don't speak proper English? It's quite painful to hear a character you love speak something that bears no resemblance to what it is claimed to be, if that "thing" can be called a language at all.<br />
<br />
<br />
Grabbing the vanity card of TBBT into an image<br />
Shang-Feng Yang, 2011-11-05<br />
<br />
The producer of the TV show "<a href="http://en.wikipedia.org/wiki/The_Big_Bang_Theory">The Big Bang Theory</a>", <a href="http://en.wikipedia.org/wiki/Chuck_Lorre">Mr. Chuck Lorre</a>, always shows a <a href="http://en.wikipedia.org/wiki/Vanity_Plate">vanity card</a> at the end of each episode. He also <a href="http://chucklorre.com/index-bbt.php">posts the same cards on his own website</a>, along with those for the other shows he has produced.<br />
<br />
Recently, for some reason, I wanted to attach the vanity card for a specific episode of the show, taken from the website, to an e-mail as an image. I wanted the image to contain only the content of the card rather than the whole page. This, of course, could be done by capturing the screen and cropping the image with something like GIMP or ImageMagick. However, since I'm a lazy guy, and the chance that I will do this more than once is quite high, <b>manually</b> capturing and cropping is certainly not an option for me. Fortunately, I have some ideas on how to do this <b>automatically</b>.<br />
<br />
<span class="fullpost"><br />
There are lots of possible ways to grab a web page into an image on the command line. My <i>weapon</i> of choice is the <i>still-buggy-but-quite-useful</i> <b>wkhtmltoimage</b> from the <a href="http://code.google.com/p/wkhtmltopdf/">wkhtmltopdf</a> project. wkhtmltoimage uses WebKit and Qt to render a given page directly into an image. The great thing about this tool is that it honors the CSS and JavaScript of the page, while also letting you replace the CSS with your own version and run some additional JavaScript before the rendering happens.<br />
<br />
At first, I tried to render the page into an image and then pass the image to ImageMagick's <b>convert</b> to cut out only the "vanity card" block of the page. However, this approach proved to be problematic, since it is hard to automatically determine the cropping parameters needed for the "<b>-crop</b>" option of convert. After inspecting the HTML and CSS sources of the page, I decided to experiment with the "visibility" property in the CSS definition. I downloaded the CSS file, set "visibility" to "hidden" for the topmost selector (the "#container" block in this case), turned the visibility back on only for the "#content" block, and supplied the customized CSS to wkhtmltoimage. This gave me a rendered image that shows only the "card" block in the center of a white background. The white "border" can then be easily removed using the "<b>-trim</b>" option of convert.<br />
<br />
Although the downloading-and-modifying-CSS approach was a success, supplying a whole modified CSS file to wkhtmltoimage is not elegant and could have side effects. A better approach is to take advantage of wkhtmltoimage's ability to run JavaScript and alter the "visibility" property of the appropriate elements after the page has finished loading. Here is my final "one-liner" solution to my problem:<br />
<br />
<blockquote class="terminal"><br />
$ wkhtmltoimage \<br />
--run-script "document.getElementById('container').style.visibility='hidden';" \<br />
--run-script "document.getElementById('content').style.visibility='visible';" \<br />
'http://chucklorre.com/index-bbt.php?p=364' - \<br />
| convert - -trim tbbt.jpg<br />
</blockquote><br />
The generated JPEG image, "tbbt.jpg", contains only the "card" I want.<br />
<br />
The principle behind this could also be applied to other pages. As usual, I wrote a script to save myself some typing; it takes an optional production number argument to grab the card for a specific episode. However, since it is a very simple script, I won't bother to post the code here...<br />
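<br />
A minimal sketch of such a wrapper could look like the following, assuming that the production number maps to the "p" query parameter seen in the URL above. The function names card_url and grab_card, and the default output file name, are made up for this illustration.<br />

```shell
#!/bin/bash
# Minimal sketch of a wrapper; card_url and grab_card are hypothetical names.

BASE_URL='http://chucklorre.com/index-bbt.php'

# Print the page URL, with the optional production number as the "p" parameter
card_url() {
    if [ -n "$1" ]; then
        printf '%s?p=%s\n' "${BASE_URL}" "$1"
    else
        printf '%s\n' "${BASE_URL}"
    fi
}

# Render the page with everything but the card hidden, then trim the border.
# Usage: grab_card [PRODUCTION_NUMBER] [OUTPUT_FILE]
grab_card() {
    local out=${2:-tbbt.jpg}
    wkhtmltoimage \
        --run-script "document.getElementById('container').style.visibility='hidden';" \
        --run-script "document.getElementById('content').style.visibility='visible';" \
        "$(card_url "$1")" - \
        | convert - -trim "${out}"
}
```

For example, "grab_card 364 card-364.jpg" would run the same pipeline as the one-liner above for production number 364.<br />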
</span><br />
<br />
ann2srt v0.3<br />
Shang-Feng Yang, 2011-10-30<br />
<br />
Although all the bug fixing, testing, and cleaning up were done several days ago, I was a little too lazy to write... Anyway, here is the "<i>official release notice</i>" of ann2srt version 0.3.<br />
<br />
Thanks to the commenter L, who helped me test and debug the script on Cygwin, version 0.3 of ann2srt can now handle annotations in languages other than Traditional Chinese that contain newlines and commas, and it also runs correctly under the Cygwin environment on the Win32 platform.<br />
<br />
<span class="fullpost"><br />
Because the version 0.2 script uses <a href="http://en.wikipedia.org/wiki/Comma-separated_values">CSV (Comma-Separated Values)</a> as an intermediate format, it fails if an annotation contains a newline or a comma. To fix this, version 0.3 uses <b>tr</b> to replace the newlines in the annotation with spaces. To address the "comma" problem, the delimiter of the intermediate stream is changed from a comma to "|".<br />
<br />
Technically speaking, the version 0.2 script should be able to run correctly under Cygwin without any modification. However, since Windows uses "DOS style" newlines that consist of CR+LF, the behavior of the script becomes unpredictable if any of the external programs used in the script is a Win32 binary, or if the input annotation file is in DOS format. To fix this, <b>tr</b> is used again to convert both the annotation and the output of the Win32 XMLStarlet from DOS format to UNIX format.<br />
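<br />
The conversion itself is tiny. As a quick illustration (the sample text and the function name dos2unix_stream are made up), deleting every CR with tr turns DOS line endings into UNIX ones, and od -c makes the difference visible:<br />

```shell
#!/bin/bash
# dos2unix_stream (hypothetical name): strip the CR from CR+LF line endings
dos2unix_stream() {
    tr -d '\r'
}

# Made-up sample with DOS line endings, dumped before and after the conversion
printf 'line one\r\nline two\r\n' | od -c
printf 'line one\r\nline two\r\n' | dos2unix_stream | od -c
```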
<br />
Let's cut to the chase. Here is the source of the version 0.3 script:<br />
<blockquote class="code"><br />
#!/bin/bash<br />
#<br />
# Convert the youtube annotation into SRT subtitle<br />
#<br />
# By Shang-Feng Yang <storm_dot_sfyang_at_gmail_dot_com><br />
# Version: 0.3<br />
# License: GPL v3<br />
#<br />
# Changelog:<br />
# * v0.3 (Oct/19/2011):<br />
# - Fix the parsing errors caused by comma and newline characters in <br />
# some English annotations<br />
# - Adding transparent dos2unix conversion for compatibility under Cygwin<br />
# * v0.2 (Jan/19/2011):<br />
# - Sort the annotations using the "begin" time as key<br />
# - Minor bugs fixing<br />
# * v0.1 (Dec/7/2010):<br />
# - Initial release<br />
<br />
<br />
ANN=$1<br />
SRT=$(basename ${ANN} .xml).srt<br />
IFS=$'\n'<br />
I=0<br />
<br />
function usage() {<br />
echo -e "Usage:\n"<br />
echo -e "\t$(basename $0) ANNOTATION_FILE\n"<br />
}<br />
<br />
function parseXML() {<br />
cat ${ANN} | tr -d '\r' |tr '\n' ' ' | xmlstarlet sel -t -m 'document/annotations/annotation' -v 'TEXT' -o '|' -m 'segment/movingRegion/rectRegion' -v '@t' -o '|' -b -n | tr -d '\r'<br />
}<br />
<br />
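function reformatTime() {<br />
local H=$(echo $1 |cut -d ':' -f 1)<br />
local M=$(echo $1 |cut -d ':' -f 2)<br />
local S=$(echo $1 |cut -d ':' -f 3)<br />
# 10# forces base 10, so that fields such as "08" or "09" are not<br />
# rejected by printf as invalid octal numbers<br />
printf '%02d:%02d:%06.3f' "$((10#${H}))" "$((10#${M}))" "${S}" |tr '.' ','<br />
}<br />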
<br />
function time2sod() {<br />
# Convert time in HH:MM:SS.SSS format into second-of-the-day value<br />
local SOD=$(echo $1 | awk -F ":" '{printf("%f\n", $1*3600+$2*60+$3);}')<br />
<br />
echo ${SOD}<br />
}<br />
<br />
[ "x${ANN}" = "x" ] && { usage; exit 1; }<br />
[ -f ${ANN} ] || { usage; exit 1; }<br />
[ -f ${SRT} ] && rm ${SRT}<br />
[ -f ${SRT}.tmp ] && rm ${SRT}.tmp<br />
<br />
for LINE in $(parseXML); do<br />
C=$(echo ${LINE} |cut -d '|' -f 1)<br />
B=$(echo ${LINE} |cut -d '|' -f 2)<br />
E=$(echo ${LINE} |cut -d '|' -f 3)<br />
echo "$(time2sod ${B})#${B}#${E}#${C}" >> ${SRT}.tmp<br />
done<br />
<br />
grep "###" ${SRT}.tmp && {<br />
echo "\"${ANN}\" has no valid annotation!" >&2<br />
rm ${SRT}.tmp<br />
exit 1<br />
}<br />
<br />
for LINE in $(cat ${SRT}.tmp|sort -n -t '#'); do<br />
(( I++ ))<br />
C=$(echo ${LINE} |cut -d '#' -f 4)<br />
B=$(reformatTime $(echo ${LINE} |cut -d '#' -f 2))<br />
E=$(reformatTime $(echo ${LINE} |cut -d '#' -f 3))<br />
echo -e "${I}\n${B} --> ${E}\n${C}\n" >> ${SRT}<br />
done<br />
<br />
rm ${SRT}.tmp<br />
<br />
</blockquote><br />
The version 0.3 script can also be downloaded from here to avoid typos caused by copy-and-paste:<br />
<a href="http://dl.dropbox.com/u/1382119/tmp/ann2srt">http://dl.dropbox.com/u/1382119/tmp/ann2srt</a><br />
<br />
In fact, I just found that the customized "code block" has lost all of its indentation after the Blogger update. Please download the correct script from the link above.<br />
</span><br />
<br />
ann2srt v0.2<br />
Shang-Feng Yang, 2011-01-20<br />
<br />
In my last post, "<a href="http://sfyang-en.blogspot.com/2010/12/converting-youtubes-annotation-into-srt.html">Converting Youtube's annotation into SRT subtitle</a>", I released a bash script called "<b>ann2srt</b>" v0.1. Version 0.1 was a pretty crude one that did not sort the subtitles in the SRT file, which could be a problem for some SRT parsers. Yesterday, I spent some time adding the sorting functionality to the script, and also fixed some minor bugs in v0.1.<br />
<span class="fullpost"><br />
<br />
The sorting is achieved by using <b>awk/gawk</b> to convert the "beginning" time of each annotation into seconds and then passing the results to <b>sort</b>. Since <b>sort</b> is part of the GNU coreutils, and <b>awk/gawk</b> should be installed on most distributions, this change should not be a big deal for most people.<br />
<br />
Here is the code for v0.2:<br />
<blockquote class="code"><br />
#!/bin/bash<br />
#<br />
# Convert the youtube annotation into SRT subtitle<br />
#<br />
# By Shang-Feng Yang <storm_DOT_sfyang_AT_gmail_DOT_com><br />
# Version: 0.2<br />
# License: GPL v3<br />
#<br />
# Changelog:<br />
# * v0.2 (Jan/19/2011):<br />
# - Sort the annotations using the "begin" time as key<br />
# - Minor bugs fixing<br />
<br />
function usage() {<br />
 echo -e "Usage:\n"<br />
 echo -e "\t$(basename $0) ANNOTATION_FILE\n"<br />
}<br />
<br />
function parseXML() {<br />
 cat ${ANN} |xmlstarlet sel -t -m 'document/annotations/annotation' -v 'TEXT' -o ',' -m 'segment/movingRegion/rectRegion' -v '@t' -o ',' -b -n<br />
}<br />
<br />
function reformatTime() {<br />
 local H=$(echo $1 |cut -d ':' -f 1)<br />
 local M=$(echo $1 |cut -d ':' -f 2)<br />
 local S=$(echo $1 |cut -d ':' -f 3)<br />
 printf '%02d:%02d:%06.3f' ${H} ${M} ${S} |tr '.' ','<br />
}<br />
<br />
function time2sod() {<br />
 # Convert time in HH:MM:SS.SSS format into second-of-the-day value<br />
 local SOD=$(echo $1 | awk -F ":" '{printf("%f\n", $1*3600+$2*60+$3);}')<br />
<br />
 echo ${SOD}<br />
}<br />
<br />
ANN=$1<br />
SRT=$(basename ${ANN} .xml).srt<br />
IFS=$'\n'<br />
I=0<br />
<br />
[ "x${ANN}" = "x" ] && { usage; exit 1; }<br />
[ -f ${ANN} ] || { usage; exit 1; }<br />
[ -f ${SRT} ] && rm ${SRT}<br />
[ -f ${SRT}.tmp ] && rm ${SRT}.tmp<br />
<br />
for LINE in $(parseXML); do<br />
 C=$(echo ${LINE} |cut -d ',' -f 1)<br />
 B=$(echo ${LINE} |cut -d ',' -f 2)<br />
 E=$(echo ${LINE} |cut -d ',' -f 3)<br />
 echo "$(time2sod ${B})#${B}#${E}#${C}" >> ${SRT}.tmp<br />
done<br />
<br />
grep "###" ${SRT}.tmp && { <br />
 echo "\"${ANN}\" has no valid annotation!"<br />
 rm ${SRT}.tmp<br />
 exit 1<br />
}<br />
<br />
for LINE in $(cat ${SRT}.tmp|sort -n -t '#'); do<br />
 (( I++ ))<br />
 C=$(echo ${LINE} |cut -d '#' -f 4)<br />
 B=$(reformatTime $(echo ${LINE} |cut -d '#' -f 2))<br />
 E=$(reformatTime $(echo ${LINE} |cut -d '#' -f 3))<br />
 echo -e "${I}\n${B} --> ${E}\n${C}\n" >> ${SRT}<br />
done<br />
<br />
rm ${SRT}.tmp<br />
</blockquote><br />
<br />
The usage is the same as for v0.1.<br />
</span><br />
<br />
Converting Youtube's annotation into SRT subtitle<br />
Shang-Feng Yang, 2010-12-08<br />
<br />
It has been a <span style="font-style: italic;">long time</span> since my last blog post. Well, I'm a lazy guy, and English is apparently not my native language. 
Besides, lots of things weren't exciting enough for me to write a long article about on the blog, so I usually write short comments on my Buzz instead.<br />
<br />
Anyway, let's cut to the chase.<br />
<br />
These days, more and more people use annotations rather than captions to add "subtitles" to Youtube videos. There are already lots of on-line and off-line "Youtube downloaders" that can download videos, the corresponding captions, or both at once, such as get_flash_videos, clive, youtube-dl, Google2SRT, and <a href="http://mike.thedt.net/ytsubs/ytsubs.php">Youtube Subtitle Ripper</a>. However, there is not much information available about how to download the annotations and convert them into SRT subtitles. Today, I found the solution.<br />
<br />
<span class="fullpost"><br />
First of all, I found <a href="http://googlesystem.blogspot.com/2010/10/download-youtube-captions.html?showComment=1287097970974#c668286111028568286">this comment on the blog post</a> about how to download the annotations in XML format. And yes, I did write a script to download the captions and annotations using <span style="font-weight: bold;">wget</span>, but it is a simple script that is not worth mentioning. After downloading the annotations in XML, the next step is to convert them into some subtitle format.<br />
<br />
Although there are many subtitle formats available, and a conversion algorithm possibly exists in the Google2SRT source code, I decided to write my own bash script that converts the XML into the SRT format, which is one of the simplest subtitle formats.<br />
<br />
The script I wrote, called <span style="font-weight: bold;">ann2srt</span>, uses <a href="http://xmlstar.sourceforge.net/"><span style="font-weight: bold;">XMLStarlet</span></a> as the XML parsing tool. 
Other than that, the script only uses bash built-ins and coreutils like <span style="font-weight: bold;">cut</span> and <span style="font-weight: bold;">tr</span>. For now, the generated SRT could have compatibility problems with some players, because the annotations in the XML are not in chronological order. Adding the sorting is possible, but since mplayer handles out-of-order subtitles correctly, I'll leave it this way <span style="font-style: italic;">for now</span>. Here is the code of <span style="font-weight: bold;">ann2srt</span>:<br />
<br />
<blockquote class="code"><br />
#!/bin/bash<br />
#<br />
# Convert the youtube annotation into SRT subtitle<br />
#<br />
# By Shang-Feng Yang <storm_dot_sfyang_at_gmail_dot_com><br />
# Version: 0.1<br />
# License: GPL v3<br />
<br />
function usage() {<br />
 echo -e "Usage:\n"<br />
 echo -e "\t$(basename $0) ANNOTATION_FILE\n"<br />
}<br />
<br />
function parseXML() {<br />
 cat ${ANN} |xmlstarlet sel -t -m 'document/annotations/annotation' -v 'TEXT' -o ',' -m 'segment/movingRegion/rectRegion' -v '@t' -o ',' -b -n<br />
}<br />
<br />
function reformatTime() {<br />
 H=$(echo $1 |cut -d ':' -f 1)<br />
 M=$(echo $1 |cut -d ':' -f 2)<br />
 S=$(echo $1 |cut -d ':' -f 3)<br />
 printf '%02d:%02d:%02.3f' ${H} ${M} ${S} |tr '.' ','<br />
}<br />
<br />
ANN=$1<br />
SRT=$(basename ${ANN} .xml).srt<br />
IFS=$'\n'<br />
I=0<br />
<br />
[ -f ${ANN} ] || { usage; exit 1; }<br />
[ -f ${SRT} ] && rm ${SRT}<br />
<br />
for LINE in $(parseXML); do<br />
 (( I++ ))<br />
 C=$(echo ${LINE} |cut -d ',' -f 1)<br />
 B=$(echo ${LINE} |cut -d ',' -f 2)<br />
 E=$(echo ${LINE} |cut -d ',' -f 3)<br />
 echo -e "${I}\n$(reformatTime ${B}) --> $(reformatTime ${E})\n${C}\n" >> ${SRT}<br />
done<br />
</blockquote><br />
<br />
<span style="font-style: italic;">A side note for <span style="font-weight: bold;">mplayer</span> users</span>: When playing videos with subtitles generated by this script, remember to turn on SSA/ASS support with the "-ass" option. 
Due to the nature of annotations, several of them may occupy the same time period; the built-in SRT parser of mplayer will show only one of them, while they are stacked when -ass is enabled.<br />
<br />
SRT is quite a simple format that does not support any of the special effects that annotations have, such as position and color. The next version of the script will convert the annotations into the SSA/ASS format, but only if I find the motivation to improve it...<br />
</span><br />
<br />
Rotate the Cube in Compiz with wmctrl<br />
Shang-Feng Yang, 2008-02-02<br />
<br />
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlS5d5ea7ersKO2KvYalpELW8R51NUM5gjrh2FzjZE1gpKcY0NH7KMfpXKsvS4tNIu1cMAEXIMlmKw2r1cKlmOHJdUamG8U8by2thqssOHmRTFLBEvq8x8ol2lW-rHYUxy1r7NTQ/s1600-h/Cube01.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlS5d5ea7ersKO2KvYalpELW8R51NUM5gjrh2FzjZE1gpKcY0NH7KMfpXKsvS4tNIu1cMAEXIMlmKw2r1cKlmOHJdUamG8U8by2thqssOHmRTFLBEvq8x8ol2lW-rHYUxy1r7NTQ/s400/Cube01.jpg" alt="" id="BLOGGER_PHOTO_ID_5073567238356282066" border="0" /></a><br />
The Cube plugin in Compiz or Compiz-Fusion turns the desktop into a virtual cube, where each face of the cube is one of the virtual desktops. Normally, the "rotation" of the cube (rotating it to switch to the virtual desktop on another face) is done with the keyboard or mouse. 
However, what if I wanted to rotate it from a shell script?<span class="fullpost"><br />
<br />
There are many ways to control the rotation of the cube, such as <a href="http://ubuntuforums.org/showthread.php?t=612358">doing so with the DBus objects provided by Compiz.</a> Besides that, is it possible to rotate the cube with <a href="http://sweb.cz/tripie/utils/wmctrl/">wmctrl</a>, a useful command-line tool that can interact with an EWMH/NetWM-compatible window manager? Although wmctrl does not directly support cube rotation in Compiz, the answer is still "YES!"<br />
<br />
To control the cube rotation with wmctrl, we should first find out what the Cube plugin actually does when managing the virtual desktops. Take my laptop for example. I use the Compiz-Fusion that came with Ubuntu 7.10, and the display resolution is set to 1024x768 due to the low hardware specification of my laptop. Let's see what we get from "wmctrl -d" when we are on the first face of the cube:<br />
<blockquote class="terminal"><br />
$ wmctrl -d<br />
0 * DG: 4096x768 VP: 0,0 WA: 0,25 1024x718 N/A<br />
</blockquote><br />
The "DG" in the output is the Desktop Geometry, "VP" is the coordinates of the Viewport Position, and "WA" is the coordinates and geometry of the WorkArea. The last field, "N/A", is the desktop name; since I did not give the virtual desktop a specific name, it is "Not Available." From this output, we can guess that the Cube is actually one very wide virtual desktop (4096 = 4 x 1024 pixels) that wraps around the cube. To verify this assumption, let's rotate to the right and run "wmctrl -d" on the second face of the cube:<br />
<blockquote class="terminal"><br />
$ wmctrl -d<br />
0 * DG: 4096x768 VP: 1024,0 WA: 0,25 1024x718 N/A<br />
</blockquote><br />
We can see that the only difference between these two tests is that the X coordinate of the VP has changed! 
With this result, we can conclude that the Cube is actually a very wide desktop, and the virtual desktop we see on each face is actually a viewport onto that desktop. This also explains why application windows can be shown across the boundary of two faces. Knowing that each face of the cube is actually a viewport, we can now achieve the rotation with wmctrl by simply changing the current viewport position!<br />
<br />
I wrote a simple BASH script for this:<br />
<blockquote class="code"><br />
#!/bin/bash<br />
#<br />
# compiz-rotate-wmctrl - Rotate the cube using wmctrl<br />
#<br />
# Author: Shang-Feng Yang <storm dot sfyang at gmail dot com><br />
# Released under GPLv3<br />
<br />
VER="1.0"<br />
<br />
<br />
function rotate() {<br />
 # The target face number (begins with 0); adding ${NF} before the final<br />
 # modulo keeps it non-negative when rotating left from face 0<br />
 TVPN=$(( ( $1 % ${NF} + ${NF} ) % ${NF} ))<br />
<br />
 # The X coordinate of the target viewport<br />
 TVPX=$(( ${TVPN} * ${WW} ))<br />
<br />
 # Change to the target viewport<br />
 wmctrl -o ${TVPX},0<br />
}<br />
<br />
function usage() {<br />
 echo -e "$(basename $0) v${VER}\n"<br />
 echo -e "Usage:\n"<br />
 echo -e "\t$(basename $0) {left|right|#}\n"<br />
 echo -e "\tWhere:\n"<br />
 echo -e "\t\tleft - rotate the cube to the left"<br />
 echo -e "\t\tright - rotate the cube to the right"<br />
 echo -e "\t\t# - rotate to #th face (begins with 0)\n\n"<br />
 echo -e "Author: Shang-Feng Yang <storm dot sfyang at gmail dot com>"<br />
 echo -e "Released under GPLv3"<br />
}<br />
<br />
# The action to be performed. $ACT could be 'left' or 'right' to rotate<br />
# left or right, accordingly. 
$ACT could also be the number of the face <br />
# to rotate into.<br />
ACT=$(echo $1 |tr '[A-Z]' '[a-z]')<br />
<br />
[ "x$ACT" == "x" ] && { usage; exit 1; } || {<br />
 case $ACT in<br />
 left|right|[0-9]|[0-9][0-9])<br />
 ;;<br />
 *)<br />
 usage<br />
 exit 1<br />
 ;;<br />
 esac<br />
}<br />
<br />
<br />
# The information about the desktop<br />
INFO=$(wmctrl -d)<br />
# The width of the desktop<br />
DW=$(echo "${INFO}"| awk '{sub(/x[0-9]+/, "", $4); print $4}')<br />
# The width of the workarea<br />
WW=$(echo "${INFO}"| awk '{sub(/x[0-9]+/, "", $9); print $9}')<br />
# The number of faces on the cube<br />
NF=$(($DW/$WW))<br />
# The X coordinate of the viewport<br />
CVPX=$(echo "${INFO}" |awk '{sub(/,[0-9]+/, "", $6); print $6}')<br />
# Current number of the face in all faces (begins with 0)<br />
CVPN=$(( ${CVPX} / ${WW} ))<br />
<br />
[ "$ACT" == "right" ] && {<br />
 ACT=$(( ${CVPN} + 1 ))<br />
} || {<br />
 [ "$ACT" == "left" ] && {<br />
 ACT=$(( ${CVPN} - 1 ))<br />
 }<br />
}<br />
<br />
rotate ${ACT}<br />
</blockquote><br />
<br />
To use the script,<br />
<ul><br />
<li>if you specify no parameter, or a wrong one, a short usage message is shown:<br />
<blockquote class="terminal"><br />
$ compiz-rotate-wmctrl <br />
compiz-rotate-wmctrl v1.0<br />
<br />
Usage:<br />
<br />
 compiz-rotate-wmctrl {left|right|#}<br />
<br />
 Where:<br />
<br />
 left - rotate the cube to the left<br />
 right - rotate the cube to the right<br />
 # - rotate to #th face (begins with 0)<br />
<br />
<br />
Author: Shang-Feng Yang <storm dot sfyang at gmail dot com><br />
Released under GPLv3<br />
</blockquote><br />
</li><br />
<li>or specify "left" to rotate to the left:<br />
<blockquote class="terminal"><br />
$ compiz-rotate-wmctrl left<br />
</blockquote><br />
</li><br />
<li>or "right" to rotate to the right:<br />
<blockquote class="terminal"><br />
$ compiz-rotate-wmctrl right<br />
</blockquote><br />
</li><br />
<li>or give a face number, beginning with 0, to rotate 
to the specified face:<br />
<blockquote class="terminal"><br />
$ compiz-rotate-wmctrl 3<br />
</blockquote><br />
</li><br />
</ul><br />
<br />
Where could this script be used? Well, a script like this makes it much easier to control the rotation when you operate the computer with a touchscreen, or when you control the desktop remotely over the network or a Bluetooth PAN.<br />
</span><br />
<br />
The reason for low frequency posting on this blog<br />
Shang-Feng Yang, 2006-03-20<br />
<br />
Although I post a little more often on my <a href="http://sfyang.blogspot.com/">Chinese version blog</a>, I post on this blog quite rarely. The reason is that I'm a busy and lazy man. Besides, the Blogger system has been quite unstable recently; I could not publish my posts a few days ago. Hmm....<br />
<br />
Creating my own printing spool<br />
Shang-Feng Yang, 2006-03-03<br />
<br />
The OIT (Office of Information Technology) provides a PostScript bulk printing service called central-ps, which has no quota limit but can only produce two-sided black-and-white printouts. In addition, there are usually three delivery times per day at which the printouts are delivered to the public bins, where people can pick them up.<br />
<br />
This service is quite useful for me, especially since I have lots of documents to print this semester. 
There are basically three ways to submit jobs to the central-ps printing queue:<br /><ul><li>Through the computers in some special locations that share the printer through Samba</li><li>Through the <a href="http://print.labs.gatech.edu/">web printing service</a></li><li>Through the <a href="http://faq.oit.gatech.edu/0232.html">prnt command on the Solaris workstation</a></li></ul>The first one is the most straightforward method and requires no file format conversion. However, OIT seems to have no plan to share the printer with all IPs on campus. Although the remaining two methods require file conversion, the web printing is also convenient. But there is some scripting error in the web printing pages that always causes browsers other than IE to complain that a time-consuming script is running and that the system could stop responding if it continues. Stopping the script does not affect the print jobs, but it feels bad to see this kind of message. The last one seems to be the most sophisticated way to submit jobs, but, since the command line is one of the most powerful tools on UN*X, I can have some fun with it!<br /><br />The idea is simple:<br /><ol><li>Write a script that monitors a specific directory. If there are PostScript files in that directory, submit them with the prnt command, move the successfully submitted files into one directory, move the failed ones into another directory, and mail a notice to me.</li><li>Schedule a cron job that executes that script every 10 minutes.</li></ol>After setting this up, I have a "private print spool" that automatically submits the files! 
All I have to do now is convert the file, transfer it to that directory, and then pick up the printouts after the delivery time!<br /><br />UN*X and the command line rock!Shang-Feng Yanghttp://www.blogger.com/profile/06302302011922621920noreply@blogger.com0tag:blogger.com,1999:blog-22175027.post-1141350872938442582006-03-03T09:16:00.000+08:002006-03-03T09:54:32.956+08:00Minimo v0.013 vs. Dell Axim x51vMy PocketPC, a Dell Axim x51v, uses M$ Windows Mobile 5.0 (WM5) as its OS. However, since WM5 is not fully compatible with earlier versions like WM2003, lots of applications that run on WM2003 cannot run correctly on WM5. The applications built into WM5 are quite "basic", and PocketIE really sucks: it now always renders ALL pages blank for no known reason. This is the second time that PocketIE has failed. After the first failure, it went back to normal once I deleted all caches and cookies, but that does not work this time. I am absolutely not willing to hard-reset only to get the sucking PocketIE back to normal!<br /><br />I personally dislike IE because it does not comply with standards and has no tabbed-browsing support. And I don't like PocketIE either, because it lacks lots of features, which makes visiting some websites extremely difficult. So I tried <a href="http://www.mozilla.org/projects/minimo/">Minimo for PocketPC</a> (MinimoCE) after I got my Axim x51v, even before PocketIE went strange. Although MinimoCE supports WM2003, versions older than v0.009 did not seem to work on WM5. Version 0.009 did run on WM5, but it was not very stable. Version 0.010 was much more stable, but the UI was not very suitable for the screen resolution of my Axim x51v. So when v0.011 was released, I installed it immediately. But v0.011 was even worse: it did not start and just showed the splash screen. Although the stalled Minimo did not hang the whole system, it was not usable in this situation. 
Originally, I thought this was caused by some bug in v0.011, but when v0.012 was released and I got the same result, I began to think that maybe it was just incompatible with my device.<br /><br />A few weeks ago, MinimoCE v0.013 was released, and it officially claims to be WM5 compatible. I, of course, installed it, hoping to get rid of the almost expired Opera Mobile, but, unfortunately, it still did not work on my device. When I ran it, an error message popped up after the splash screen:<br /><br /><blockquote>TypeError: securityUI has no properties</blockquote><br />The popup asks the user to report this as a bug. I wanted to report it to the Minimo project, but, since Minimo's bugzilla system requires a user account, and I am a little too lazy to create one only to report this bug, I did not. Instead, I did some searching on Minimo's project page, and found something interesting in the <a href="http://forums.mozillazine.org/viewforum.php?f=47">Minimo forum of mozillaZine</a>: some people had the same problem with MinimoCE v0.012 on their Axim x51v, but someone found that Minimo works if the x51v is hard-reset and MinimoCE is then re-installed. Furthermore, some other people who were not willing to hard-reset found that it can work if <a href="http://forums.mozillazine.org/viewtopic.php?t=359814&highlight=axim+x51v#1980335">MinimoCE is reinstalled after completely removing the old installation files</a>.<br /><br />Since I had no idea whether this would also work for v0.013 in my case, I went through the following steps:<br /><ol><li>Uninstall "Mozilla Minimo" from Setting->Remove Programs</li><li>Completely delete the \Program Files\Minimo folder</li><li>Completely delete the \Windows\Mozilla folder</li><li>Reinstall MinimoCE v0.013</li></ol>And it works like a charm!<br /><br />The UI of Minimo v0.013 has been redesigned, and it looks much better on PocketPC than the older versions. 
Although it has some problems with the on-screen software keyboard, it is a usable browser for me now.<br /><br />The following screenshot is the one shown on the Minimo project page. Maybe I'll take one of my own later.<br /><br /><img src="http://www.mozilla.org/projects/minimo/images/new_ui_1.PNG" alt="Minimo's new UI on version 0.013" />Shang-Feng Yanghttp://www.blogger.com/profile/06302302011922621920noreply@blogger.com0tag:blogger.com,1999:blog-22175027.post-1139602383090771082006-02-11T03:55:00.000+08:002006-02-11T09:37:14.900+08:00Skype for PocketPC Sucks!My Dell Axim x51v uses Windows Mobile 5 as its OS, and older versions of Skype for PocketPC could not run on that platform. Although an alpha version of Skype that supports WM5 was released on the forum last year, it was not very stable and was not fully functional.<br /><br />Last month, Skype finally released version 1.2.0.89, which officially supports the WM5 platform. However, when I set the language option to "Chinese (Traditional)" because I speak Traditional Chinese, the UI messages were actually in Simplified Chinese, except for the starting screen. When I changed the language option to Simplified Chinese, I got the Traditional Chinese ones. Although this is not a big deal, I mailed a report of this error to Skype.com. But I got neither a reply nor a fixed version.<br /><br />Today, I happened to visit the Skype website, and found that the new 2.0.0.39 version of Skype for PocketPC is available. I downloaded and installed it. Since I had set my language option to "Chinese (Simplified)" in 1.2.0.89, 2.0.0.39 used that as its UI language setting. After some checking, I thought the language error had been fixed, and I switched the language option back to "Chinese (Traditional)." Do you think that everything went well this time? Absolutely not! Skype refused to switch languages and popped up an error message saying that my system does not support the language I selected. How the hell could this happen? 
Skype for PocketPC Sucks!<br /><br />Well, although the UI refused to change the language, this does not mean that I cannot change it myself. After some digging into Skype's configuration files, I found that the UI language setting is actually recorded in "shared.xml" under "\Application Data\Skype." The configuration file is in XML format, and the language option is the integer value of the "&lt;language&gt;" element, which is a sub-element of the "&lt;ui&gt;" element. But the question now is, what is the value for Traditional Chinese? After some trial-and-error runs, I found that "1" is for Traditional Chinese. After changing that value to 1, I got my Skype for PocketPC in Traditional Chinese!Shang-Feng Yanghttp://www.blogger.com/profile/06302302011922621920noreply@blogger.com0tag:blogger.com,1999:blog-22175027.post-1139461345881011752006-02-09T12:59:00.000+08:002006-02-09T13:02:25.890+08:00My new blog is hereIt has been a long time since I last posted on my experimental XOOPS2+weBlog system.<br /><br />I have created two blogs, one in <a href="http://sfyang.blogspot.com/">Traditional Chinese</a>, and the other one (this one) in English. Both use UTF-8 encoding.Shang-Feng Yanghttp://www.blogger.com/profile/06302302011922621920noreply@blogger.com0