2
<html>
3
        <head>
4
                <title>forgottenislanderbot -- Documentation</title>
5
                <style type="text/css">
6
                        body {font-family: Helvetica, Arial, sans-serif;}
7
                        .toc li {list-style:none;}
8
                        div.cmd {font-family:"Courier New", Courier, monospace; font-size: 14pt;text-indent:2em;background-color:#BBBBBB;}
9
                        .attn {background-color:#ff0000;}
10
                        .footer {color:#999999;text-align:center;}
11
                </style>
12
        </head>
13
        <body>
14
                <a name="0" />
15
                <h1>forgottenislanderbot -- Documentation</h1>
16
                <p>
17
                        forgottenislanderbot is a set of computer programs for
18
                        gathering data on Bell Aliant's <abbr title="Digital Subscriber Loop">DSL</abbr> high-speed internet
19
                        coverage on PEI by querying Bell Aliant's website, and
20
                        displaying the data on a virtual globe, like Google
21
                        Earth.
22
                </p>
23
                <hr />
24
                <a name="1" />
25
                <h2>1. Table of Contents</h2>
26
                <div class="toc">
27
                        <ul>
28
                                <li>
29
                                        <a href="#1">1. Table of contents</a>
30
                                </li>
31
                                <li>
32
                                        <a href="#2">2. Overview</a>
33
                                </li>
34
                                <li>
35
                                        <a href="#3">3. Installation</a>
36
                                </li>
37
                                <li>
38
                                        <a href="#4">4. Use</a>
39
                                        <ol>
40
                                                <li>
41
                                                        <a href="#4.1">4.1. Using the command-line</a>
42
                                                </li>
43
                                                <li>
44
                                                        <a href="#4.2">4.2. Getting the civic-address database</a>
45
                                                        <ol>
46
                                                                <li>
47
                                                                        <a href="#4.2.1">4.2.1. Downloading the raw data</a>
48
                                                                </li>
49
                                                                <li>
50
                                                                        <a href="#4.2.2">4.2.2. Converting to a <abbr title="ForgottenIslanderBot">FIB</abbr> database</a>
51
                                                                </li>
52
                                                        </ol>
53
                                                </li>
54
                                                <li>
55
                                                        <a href="#4.3">4.3. Crawling for <abbr title="Digital Subscriber Loop">DSL</abbr> coverage</a>
56
                                                <ol>
58
                                                        <li>
59
                                                                <a href="#4.3.1">4.3.1. Robot etiquette</a>
60
                                                        </li>
61
                                                        <li>
62
                                                                <a href="#4.3.2">4.3.2. Planning the crawling</a>
63
                                                        </li>
64
                                                        <li>
65
                                                                <a href="#4.3.3">4.3.3. Running the bot</a>
66
                                                        </li>
67
                                                </ol>
68
                                        </li>
69
                                        <li>
70
                                                <a href="#4.4">4.4. Generating the map</a>
71
                                                <ol>
72
                                                        <li>
73
                                                                <a href="#4.4.1">4.4.1. Producing the <abbr title="Keyhole Markup Language">KML</abbr> file</a>
74
                                                        </li>
75
                                                        <li>
76
                                                                <a href="#4.4.2">4.4.2. Using the <abbr title="Keyhole Markup Language">KML</abbr> with Google Earth</a>
77
                                                        </li>
78
                                                        <li>
79
                                                                <a href="#4.4.3">4.4.3. Distributing the map</a>
80
                                                        </li>
81
                                                </ol>
82
                                        </li>
83
                                </ol>
84
                        </li>
85
                        <li>
86
                                <a href="#5">5. Licenses</a>
87
                        </li>
88
                </ul>
89
        </div>
90
        <a href="#0">[ top ]</a>
91
        <hr />
92
 
93
        <a name="2" />
94
        <h2>2. Overview</h2>
95
        <p>
96
                ForgottenIslanderBot (<abbr title="ForgottenIslanderBot">FIB</abbr>) was written in January 2010 after it
97
                became apparent Bell Aliant was not expanding their <abbr title="Digital Subscriber Loop">DSL</abbr> coverage
98
                across the whole of PEI, and would be offering their 3G modem as
99
                an alternative. Given numerous concerns about the 3G modems,
                and vagueness about how many addresses would be left without
                <abbr title="Digital Subscriber Loop">DSL</abbr>, the author saw the need for a map of Bell
                Aliant's <abbr title="Digital Subscriber Loop">DSL</abbr> coverage, and realized that such a map could be made.
103
        </p>
104
        <p>
105
                The bot uses the Check Availability tool on Bell Aliant's
106
                website to find out whether addresses in the PEI Civic Address
107
                Database are eligible for <abbr title="Digital Subscriber Loop">DSL</abbr>. This data is only as good as the
108
                website's database, but may be better than any other map
109
                available. To make a map, <abbr title="ForgottenIslanderBot">FIB</abbr> generates a <abbr title="Keyhole Markup Language">KML</abbr> overlay, which can
110
                be displayed in virtual globe software, like Google Earth.
111
        </p>
112
        <p>
113
                forgottenislanderbot is free, open-source software under the GNU
114
                General Public License.
115
        </p>
116
        <a href="#0">[ top ]</a>
117
        <hr />
118
 
119
        <a name="3" />
120
        <h2>3. Installation</h2>
121
        <p>
122
                <abbr title="ForgottenIslanderBot">FIB</abbr> is written in the
123
                <a href="http://www.python.org/">Python programming language</a>.
124
                To run <abbr title="ForgottenIslanderBot">FIB</abbr>,
125
                you will need a computer with the Python software and a fast internet
126
                connection which you can leave running overnight.
127
        </p>
128
        <p>
129
                The Python <i>interpreter</i>, which is required to run <abbr title="ForgottenIslanderBot">FIB</abbr>, can
130
                be downloaded for free from the internet. Python is available
131
                for most recent operating systems. Forgottenislanderbot requires
132
                Python version 2.5 or 2.6 -- it does not work with the newest
133
                version, 3.1, yet. Python 2.6 can be downloaded from
134
                <a href="http://www.python.org/download/releases/2.6.4/">
135
                http://www.python.org/download/releases/2.6.4/</a>.
136
        </p>
137
        <p>
138
                To display the map on a virtual globe, you will need virtual
139
                globe software, such as <a href="http://earth.google.com/">Google Earth</a>
140
                or <a href="http://worldwind.arc.nasa.gov/">NASA World Wind</a>.
141
                Both of those require a recent computer with a mainstream
142
                operating system, as they use 3D graphics and more processor
143
                power and memory.<br />
144
                Note that you do not need to use the <i>same</i> computer to run <abbr title="ForgottenIslanderBot">FIB</abbr> and
145
                display the map.
146
        </p>
147
        <p>
148
                <abbr title="ForgottenIslanderBot">FIB</abbr> can be installed in a folder on your computer, like
149
                <nobr><b>C:\<abbr title="ForgottenIslanderBot">FIB</abbr>\</b></nobr> or <nobr><b>Documents/<abbr title="ForgottenIslanderBot">FIB</abbr>/</b></nobr>,
150
                or it can be run from a folder on a USB drive. To install <abbr title="ForgottenIslanderBot">FIB</abbr>,
151
                unzip the <abbr title="ForgottenIslanderBot">FIB</abbr> zip file into such a folder.
152
        </p>
153
        <a href="#0">[ top ]</a>
154
        <hr />
155
 
156
        <a name="4" />
157
        <h2>4. Use</h2>
158
        <a name="4.1" />
159
        <h3>4.1. Using the command-line</h3>
160
        <p>
161
                <abbr title="ForgottenIslanderBot">FIB</abbr> is a <i>command-line program</i> -- it must be run and
162
                controlled from the command line, also known as the command
163
                prompt, terminal, MS-DOS prompt, xterm, virtual terminal, or
164
                shell. Less-young users may recall MS-DOS, or
165
                U<sub>NIX</sub>.<br />
166
                Command lines differ between operating systems; how to open
                one on a few common systems is described below:
168
                <dl>
169
                        <dt>Microsoft Windows 95/98<sub>, maybe also ME</sub></dt>
170
                        <dd>Start &gt; Programs &gt; Accessories &gt; MS-DOS
171
                        Prompt</dd>
172
                        <dt>Microsoft Windows XP<sub>, maybe also NT/2000, maybe
173
                        Vista</sub></dt>
174
                        <dd>Start &gt; All Programs &gt; Accessories &gt;
175
                        Command Prompt</dd>
176
                        <dt>Apple Mac OS X</dt>
177
                        <dd>Finder &gt; Applications &gt; Utilities &gt;
178
                        Terminal</dd>
179
                        <dt>Linux</dt>
180
                        <dd><i>Due to variation, I'll assume most Linux users know
181
                        where to find a terminal</i></dd>
182
                </dl>
183
        </p>
184
        <p>
185
                Because <abbr title="ForgottenIslanderBot">FIB</abbr> must be run by Python, on most OSs you'll need to
186
                manually invoke it.
187
                <ol>
188
                        <li>Go to your command-line;</li>
189
                        <li>Navigate to the folder <abbr title="ForgottenIslanderBot">FIB</abbr> is installed to.<br />
190
                        <b>cd \<abbr title="ForgottenIslanderBot">FIB</abbr>\</b> or <b>cd Documents/<abbr title="ForgottenIslanderBot">FIB</abbr>/</b> might work,
191
                        depending on where you installed it.</li>
192
                        <li>You need to know how to run Python. If it is
193
                        installed completely, and in your PATH, running
194
                        <div class="cmd">python --version</div> should not
195
                        produce an error message. If not, you can either add
196
                        Python to your PATH, or run Python with its full path,
197
                        for example <div class="cmd">"\Python25\python.exe"
198
                        --version</div></li>
199
                        <li>Once you can run Python, you can run <abbr title="ForgottenIslanderBot">FIB</abbr> with
200
                        commands like the following:
201
                        <div class="cmd"><nobr>python fib-dbutil.py database.sqlite --stats yyyy 1 n</nobr></div>
202
                        Instructions in this document are presented in this format;
203
                        however, if you have to invoke Python differently, you
204
                        will have to change the command as needed.
205
                        </li>
206
                </ol>
207
        </p>
208
        <a name="4.2" />
209
        <h3>4.2. Getting the civic-address database</h3>
210
        <p>
211
                        The bot needs a list of civic addresses to look up on the website.
212
                        The PEI government offers the civic address database online,
213
                        with no apparent constraints on use. It also has
214
                        latitude/longitude coordinates for each address, which
215
                        greatly assist making the map.<br />
216
                        Bell Aliant's website seems to be using the same list,
217
                        as street addresses appear in exactly the same format as
218
                        is used by the provincial database. Google Maps is
219
                        probably also using the provincial database, as the
220
                        coordinates it finds for a civic address are the same.
221
        </p>
222
        <a name="4.2.1" />
223
        <h4>4.2.1. Downloading the raw data</h4>
224
        <p>
225
                        The civic address database has to be downloaded and
226
                        converted to a sqlite database before the bot can use
227
                        it. To get the database, go to <a href="http://www.gov.pe.ca/civicaddress/download/index.php3">http://www.gov.pe.ca/civicaddress/download/index.php3</a>
228
                        and download whichever counties you need (which in most
229
                        cases will be all three):
230
                        <ol>
231
                                <li>Select a county in the drop-down box.</li>
232
                        <li>Click <i>Download all addresses in this
233
                                county</i>.</li>
234
                        <li>It will load a new page.</li>
235
                        <li>Make sure <i>Tab-delimited ASCII</i> is
236
                                selected as the Download Format.</li>
237
                        <li>Make sure <i>Street Number</i>, <i>Street
238
                                Name</i>, <i>Community Name</i>, <i>Apartment
239
                                Number</i>, <i>County</i>, <i>Latitude</i>,
240
                                <i>Longitude</i> are selected. <i>Police
241
                                Department</i>, <i>Fire Department</i>, and <i>Ambulance</i>
242
                                are unnecessary, but may be selected anyway.</li>
243
                        <li>Click <i>Download the Data</i>. You will probably be asked to
244
                                save the file; save it to the folder <abbr title="ForgottenIslanderBot">FIB</abbr> is
245
                                installed in as <b>[county].tsv</b>.</li>
246
                        <li>Repeat for additional counties.</li>
247
                </ol>
248
        </p>
249
        <a name="4.2.2" />
250
        <h4>4.2.2. Converting to a <abbr title="ForgottenIslanderBot">FIB</abbr> database</h4>
251
        <p>
252
                As downloaded, the civic address databases are in
253
                <i>tab-separated-values</i> (TSV) format, where each field, e.g. civic
254
                number, is separated from the next, e.g. street name, by a tab
255
                character, with one address per line. <abbr title="ForgottenIslanderBot">FIB</abbr> works with
256
                <i>sqlite</i> databases, which are files containing data in a
257
                non-human-readable format, but which allow for easier searching,
258
                organizing, and updating of the data.<br />
259
                <abbr title="ForgottenIslanderBot">FIB</abbr> has a command to combine the three TSV files into one sqlite
260
                file. Assuming the TSV files are <b>queens.tsv</b>,
261
                <b>kings.tsv</b>, and <b>prince.tsv</b>, and they are in the
262
                same folder as <abbr title="ForgottenIslanderBot">FIB</abbr>, run:
263
                <div class="cmd"><nobr>python fib-cadb2sql.py database.sqlite
                queens.tsv kings.tsv prince.tsv</nobr></div>
266
                It should output a file called <b>database.sqlite</b>. When the
267
                bot is run, it will read civic addresses from this file, and
268
                write to it whether <abbr title="Digital Subscriber Loop">DSL</abbr> is available at each address.
269
        </p>
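        <p>
                As a rough illustration of the conversion described above (not
                the actual code of fib-cadb2sql.py), the following Python 3
                sketch loads tab-separated rows into an sqlite table. The table
                name, the column names and their order, and the
                <b>dsl_status</b> column are all assumptions made for this
                example; the sample rows are made up.
        </p>

```python
import csv
import io
import sqlite3

# Hypothetical column order for the downloaded TSV files; the real
# files may order or name their fields differently.
COLUMNS = ("street_no", "street_name", "community", "apartment",
           "county", "latitude", "longitude")


def load_tsv(conn, tsv_file):
    """Insert every row of one tab-separated file into the addresses table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS addresses ("
        "street_no TEXT, street_name TEXT, community TEXT, apartment TEXT, "
        "county TEXT, latitude REAL, longitude REAL, "
        "dsl_status TEXT DEFAULT NULL)"  # left empty until a crawl fills it in
    )
    reader = csv.reader(tsv_file, delimiter="\t")
    # Keep only rows that have exactly the expected number of fields.
    rows = [tuple(row) for row in reader if len(row) == len(COLUMNS)]
    conn.executemany(
        "INSERT INTO addresses (%s) VALUES (?, ?, ?, ?, ?, ?, ?)"
        % ", ".join(COLUMNS),
        rows,
    )
    conn.commit()
    return len(rows)


# Made-up sample data standing in for a downloaded county file.
sample = io.StringIO(
    "12\tMain St\tSampletown\t\tQueens\t46.25\t-63.13\n"
    "14\tMain St\tSampletown\t\tQueens\t46.26\t-63.14\n"
)
conn = sqlite3.connect(":memory:")
inserted = load_tsv(conn, sample)
empty = conn.execute(
    "SELECT COUNT(*) FROM addresses WHERE dsl_status IS NULL"
).fetchone()[0]
print(inserted, empty)
```

        <p>
                In this sketch every address starts with an empty status, which
                mirrors the idea of a column that is filled in later by the
                crawl.
        </p>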
270
        <a name="4.3" />
271
        <h3>4.3. Crawling for <abbr title="Digital Subscriber Loop">DSL</abbr> coverage</h3>
272
        <p>
273
                The sqlite civic address database has an empty column for <abbr title="Digital Subscriber Loop">DSL</abbr>
274
                availability. Filling it in is where the real <b>bot</b> part of
275
                forgottenislanderbot comes in.
276
        </p>
277
        <a name="4.3.1" />
278
        <h4>4.3.1. Robot etiquette</h4>
279
        <p>
280
                <abbr title="ForgottenIslanderBot">FIB</abbr> is in a class of computer programs known as <i>bots</i>,
281
                <i>robots</i>, <i>web robots</i>, <i>webbots</i>, or <i>crawlers</i>.
282
                These programs load web pages by themselves, without human
283
                help -- they often load more web pages more quickly than
284
                a human could, and this is usually what they're written for. A
285
                subset of bots known as <i>spiders</i> follow web links from
286
                site to site; <abbr title="ForgottenIslanderBot">FIB</abbr> is not one of these.
287
        </p>
288
        <p>
289
                Because of robots' abilities to request a vast number of web
290
                pages, they are sometimes viewed as a nuisance or a threat. Some
291
                webmasters (if they are on top of things) might not want bots to
292
                view certain pages on their website (like a <abbr title="Digital Subscriber Loop">DSL</abbr> availability
293
                page), or they may not want bots at all. To let co-operative
294
                bots know of such policies, a standard exists around a
295
                <i>robots.txt</i> file, which can be put in the highest level of
296
                a web server. Compliant bots will check for this file, and
297
                interpret it to determine what they may view.
298
        </p>
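        <p>
                The check a compliant bot performs can be sketched with
                Python 3's standard <b>urllib.robotparser</b> module. This is
                only an illustration, not how <abbr title="ForgottenIslanderBot">FIB</abbr> itself is written; the
                robots.txt content, the bot name <b>examplebot</b>, and the
                example.com addresses are all made up.
        </p>

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; a real bot would download this file from
# the top level of the web server instead.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


allowed = may_fetch(ROBOTS_TXT, "examplebot",
                    "http://www.example.com/index.html")
blocked = may_fetch(ROBOTS_TXT, "examplebot",
                    "http://www.example.com/private/check")
print(allowed, blocked)
```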
299
        <p>
300
                <abbr title="ForgottenIslanderBot">FIB</abbr> checks for a robots.txt and, to an extent, tries to
                work out what it is permitted to access. However, if it is
                forbidden to access its target, it will ask the operator for
                permission to override the restriction and fetch the data
                anyway. Because of this override, <abbr title="ForgottenIslanderBot">FIB</abbr> effectively treats
                robots.txt as an inconvenient formality, and shifts the
                ethical burden of overriding it from the programmer to the
                operator.<br />
307
                When I last checked, Bell Aliant did not have a robots.txt file.
308
                If <abbr title="ForgottenIslanderBot">FIB</abbr> has much of an effect, they just might put one in. ;-)<br />
309
                Furthermore, there was nothing in their website Terms of Service
310
                forbidding bots specifically -- just DoS attacks.
311
        </p>
312
        <p>
313
                A <i>Denial-of-Service (DoS)</i> attack is a hostile action where
314
                a webserver is bombarded with numerous requests, from malicious
315
                bots. At some point it will become overloaded and be unable to
316
                serve any more pages -- hence the denial-of-service. Robot
317
                operators and programmers must be careful to avoid appearing as
318
                a DoS attack.<br />
319
                Many web servers have monitoring software, which can sound
320
                alarms if unusually large numbers of page requests are received.
                Law-abiding bot operators should avoid generating alarming
                amounts of internet traffic.
        </p>
323
        <p>
324
                <abbr title="ForgottenIslanderBot">FIB</abbr> is <i>single-threaded</i> -- it requests web pages one at a
325
                time -- so it is unlikely to appear as a DoS threat. (Most DoS
326
                attackers use between twenty and several thousand requests per
327
                second.) Furthermore, <abbr title="ForgottenIslanderBot">FIB</abbr> is easily configured to appear as less
328
                of a threat:<dl>
329
                        <dt><b>interval</b>, or <b>delay</b>
330
                                </dt>
331
                                <dd>the time the bot pauses between loading each
332
                        address's <abbr title="Digital Subscriber Loop">DSL</abbr> status. Given to the bot in seconds, but
333
                        can be expressed as a decimal. The longer the interval,
334
                        the less threatening it appears to any monitoring
335
                        software.</dd>
336
                                <dt>
337
                                        <b>sparsity</b>
338
                                </dt>
339
                                <dd>how many addresses to skip, and not check.
340
                        Technically, the sparsity is the minimum difference
341
                        between civic numbers of checked addresses, evaluated on
342
                        a per-road basis. To get every house, set to 1. To get
343
                        at least one house per road, and about every thousandth
344
                        house on the same road, set to 1000. For example, with a
                        sparsity of 10 on a road whose houses are #500, #509,
                        #510, #560, and #610, the bot would check #500 (the
                        first house on the road), skip #509, and check #510,
                        #560, and #610.</dd>
349
                        </dl>
350
                In practice, a one-second delay is probably safe. The PEI civic
                address database contains, at last check, <nobr>68 023</nobr>
                addresses (homes and businesses). At a moderate sparsity, say 40,
                the bot will get about <nobr>10 000</nobr> addresses -- more than
                enough for a detailed map. However, if you check all
                <nobr>68 023</nobr> addresses (with a sparsity of 1), not only
                will you be able to produce maps with any density (see
                <a href="#4.4.1">section 4.4.1</a> for details on this), but you
                will have a precise count of the addresses Bell Aliant's website
                says can't get <abbr title="Digital Subscriber Loop">DSL</abbr>! At any higher sparsity, you'd have to
                estimate that count, which isn't as accurate. Even the highest
                sparsity possible -- one address per road (which can be entered
                as 2781, or higher), without excluding cities -- still checks
                5108 addresses. As those are the first addresses on each road,
                not the last, that might give a lower count of
                no-<abbr title="Digital Subscriber Loop">DSL</abbr> addresses than actually exist.
365
        </p>
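        <p>
                The sparsity rule can be sketched in a few lines of Python. This
                is one reading of the definition above (the minimum difference
                between the civic numbers of checked addresses on a road), not
                <abbr title="ForgottenIslanderBot">FIB</abbr>'s actual code; the function name
                <b>select_by_sparsity</b> is made up for the example.
        </p>

```python
def select_by_sparsity(civic_numbers, sparsity):
    """Pick civic numbers on one road that are at least `sparsity` apart."""
    selected = []
    for number in sorted(civic_numbers):
        # Always take the first house; after that, take a house only if
        # it is far enough from the last one selected.
        if not selected or number - selected[-1] >= sparsity:
            selected.append(number)
    return selected


road = [500, 509, 510, 560, 610]
print(select_by_sparsity(road, 10))    # [500, 510, 560, 610]
print(select_by_sparsity(road, 1))     # every house on the road
print(select_by_sparsity(road, 1000))  # [500]
```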
366
                <a name="4.3.2" />
367
                <h4>4.3.2. Planning the crawling</h4>
368
                <p>
369
                Before running the bot, you should decide on the sparsity and
370
                the delay. In choosing these, you should be aware how long it
371
                will take to run. To find out, you must know a) how many
372
                addresses your given sparsity will check, and b) how long of a
373
                delay to allow, plus how long it takes to check an address.
374
        </p>
375
                <p>
376
                One of the forgottenislanderbot tools lets you see how many
377
                addresses will be produced by a given sparsity. Assuming your
378
                civic-address sqlite database is <b>database.sqlite</b>, and it
379
                is in the same folder as <abbr title="ForgottenIslanderBot">FIB</abbr>, run:
380
                <div class="cmd"><nobr>python fib-dbutil.py database.sqlite
381
                --stats yyyy <i>sparsity</i>
382
                                        <i>n</i>
383
                                </nobr>
384
                        </div>
385
                        <br />
386
                where
387
                <dl>
388
                        <dt>sparsity</dt>
389
                                <dd>is the sparsity, for example 40, and</dd>
390
                                <dt>n</dt>
391
                                <dd>is <b>n</b>, as in no, unless you wish to
392
                        specifically exclude Charlottetown and Summerside (on
393
                        the assumption high-speed is readily available), in
394
                        which case you use <b>y</b>.</dd>
395
                        </dl>
396
                That command will produce output like the following:
397
<pre>ERROR :          0
EMPTY :          5109
NODSL :          0
BASIC :          0
ULTRA :          0
TOTAL :          5109</pre>
403
                The second-to-top value, EMPTY, is the one to observe -- with that
404
                sparsity (2780), 5109 addresses would be checked. As this example
405
                shows an empty database, TOTAL can also be used as the guideline.
406
        </p>
407
                <p>
408
                To estimate the time the bot will take to run, multiply the
409
                number of addresses to be checked by the
410
                time to check each address, which is the sum of the actual time
411
                to load the page and the bot's delay. On a dial-up connection, the
                page might take 12 seconds to load; on broadband, it might take
                only <sup>1</sup>/<sub>20</sub><sup>th</sup> of a second,
                although in practice it could easily take one second. For
                example, with a half-second load time and a one-second delay:<br />
417
                        <nobr>
418
                                <b>68 000 &#215; ( <sup>1</sup>/<sub>2</sub> + 1 ) = 102 000
419
                seconds</b>
420
                        </nobr>
421
                        <br />
422
                Divide seconds by 3600 to find hours -- in this example,
423
                28<sup>1</sup>/<sub>3</sub>.
424
        </p>
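        <p>
                The same arithmetic can be written as a small Python helper,
                which is handy for trying out different sparsities and delays.
                The function name <b>estimated_hours</b> is made up for this
                sketch; it is not part of <abbr title="ForgottenIslanderBot">FIB</abbr>.
        </p>

```python
def estimated_hours(addresses, load_seconds, delay_seconds):
    """Estimate crawl time: addresses x (page load time + delay), in hours."""
    total_seconds = addresses * (load_seconds + delay_seconds)
    return total_seconds / 3600.0


# The worked example from the text: 68 000 addresses, half a second to
# load each page, and a one-second delay.
hours = estimated_hours(68000, 0.5, 1)
print(round(hours, 2))  # 28.33
```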
425
                <p>
426
                <abbr title="ForgottenIslanderBot">FIB</abbr> stores <abbr title="Digital Subscriber Loop">DSL</abbr>-status values in its sqlite database, and will
427
                not recheck an address if its status is already known. Because
428
                of this, you can run it in several partial sessions, getting more of the database
429
                checked each time. Furthermore, you can run it at decreasing
430
                sparsity -- for example, first with a sparsity of 1000, then 40,
431
                then 10, then 1. The easiest way to stop <abbr title="ForgottenIslanderBot">FIB</abbr>, as described in <a href="#4.1">section 4.1</a>, is to press <b>Control-C</b> at the
432
                command-line it's running in.
433
        </p>
434
                <a name="4.3.3" />
435
                <h4>4.3.3. Running the bot</h4>
436
                <p>
437
                You can continue to use the computer <abbr title="ForgottenIslanderBot">FIB</abbr> runs on -- it doesn't
438
                use very much memory or processor power, and, with its delay,
439
                not much network bandwidth. If the computer crashes, a few
440
                addresses will have been forgotten, but once the bot is started
441
                again it will recheck them. Keep in mind, though, that the
442
                computer <abbr title="ForgottenIslanderBot">FIB</abbr> is running on will need to be left on, and logged
443
                in.
444
        </p>
445
                <p>
446
                Once you have decided on a sparsity and delay, you can invoke
447
                the bot. Assuming your sqlite database is <b>database.sqlite</b>
448
                and it's in the same folder as <abbr title="ForgottenIslanderBot">FIB</abbr>, run:
449
                <div class="cmd"><nobr>python fib-crawlbot.py database.sqlite 1 1 n</nobr>
450
                        </div>
451
                where, respectively,
452
                <dl>
453
                        <dt>1</dt>
454
                                <dd>is the sparsity (every house),</dd>
455
                                <dt>1</dt>
456
                                <dd>is the delay (one second), and</dd>
457
                                <dt>n</dt>
458
                                <dd>as in no, means not to skip the cities.</dd>
459
                        </dl>
460
                After a few seconds, in which there is a small chance of it
461
                asking you to override a robots.txt, it will begin testing
462
                addresses. It will display the full address of each it checks,
463
                with a rough progress bar of dots, until it displays
464
                <b>1</b>, <b>2</b>, or <b>3</b>, which correspond to no <abbr title="Digital Subscriber Loop">DSL</abbr>, 1.5
465
                Mbps <abbr title="Digital Subscriber Loop">DSL</abbr>, and 7 Mbps <abbr title="Digital Subscriber Loop">DSL</abbr>, respectively. It will then sleep for
466
                the specified delay, and go on to the next address.<br />
467
                You can (and, unless you're highly bored, <i>should</i>) leave
468
                it until it finishes, at which point it will write <b>done</b>
469
                and return to the command-line. If you tell it to get the full
470
                database, come back later, and find it done, you could run the
471
                same command again -- it will only re-get addresses it failed to
472
                fetch the first time (it should display error messages when that
473
                happens).
474
        </p>
                <p>
                Once the bot has fetched the full database, you can confirm it
                checked every address, and see the raw-number totals, with this
                command (assuming your <b>database.sqlite</b> is in the same
                folder as <abbr title="ForgottenIslanderBot">FIB</abbr>):
                <div class="cmd"><nobr>python fib-dbutil.py database.sqlite
                --stats yyyy 1 n</nobr>
                        </div>
                That will display the complete totals. For an example of the
                output table, see <a href="#4.3.2">section 4.3.2</a>. If every
                address was successfully checked, ERROR and EMPTY will both be
                0. You can see the results in the NO<abbr title="Digital Subscriber Loop">DSL</abbr>, BASIC, and ULTRA rows.
        </p>
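                <p>
                        The completeness check described above can be pictured as a tally over the stored statuses. This is a minimal Python 3 sketch, not <abbr title="ForgottenIslanderBot">FIB</abbr>'s own code: the database layout is not described here, so the statuses are held in a plain list, and the use of <b>None</b> and <b>"error"</b> for unchecked and failed addresses is an assumption made for illustration. The pairing of BASIC with code 2 and ULTRA with code 3 is also assumed from the order of the rows.
                </p>

```python
from collections import Counter

# Row names as shown by --stats; the keys for EMPTY and ERROR are assumed.
ROW_NAMES = {None: "EMPTY", "error": "ERROR", 1: "NODSL", 2: "BASIC", 3: "ULTRA"}

def tally(statuses):
    """Count how many addresses fall into each row."""
    counts = Counter(ROW_NAMES[status] for status in statuses)
    # Report every row, including those with a zero count.
    return {name: counts.get(name, 0) for name in ROW_NAMES.values()}

def fully_checked(totals):
    """True when no address is unchecked or failed."""
    return totals["ERROR"] == 0 and totals["EMPTY"] == 0

totals = tally([1, 3, 3, 2, 1, 3])
print(totals, fully_checked(totals))
```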
                <a name="4.4" />
                <h3>4.4. Generating the map</h3>
                <p>
                The <b>--stats</b> command provides the totals of each status,
                but forgottenislanderbot was written with the intention of
                making a map. The map is to be displayed in virtual globe
                software, such as <span class="attn">
                <a href="http://earth.google.com/">Google Earth</a> or
                <a href="http://worldwind.arc.nasa.gov/">NASA World Wind</a></span>.
                So far, it has only been tested with Google Earth, and
                by default it uses map icons provided by Google Earth. Although
                Google Earth is better known and has more detailed imagery, it
                is copyrighted, with restrictions on use; I think NASA World
                Wind's imagery is in the public domain, with no constraints on
                modification or redistribution.
        </p>
                <p>
                To display <abbr title="Digital Subscriber Loop">DSL</abbr> coverage on a virtual globe, forgottenislanderbot
                generates a <abbr title="Keyhole Markup Language">KML</abbr>
                file, which contains numerous lat/long coordinates, each matched with
                a coloured icon. Virtual globe software will display the
                designated icon on the map at the specified lat/long
                coordinates, producing a detailed overlay of <abbr title="Digital Subscriber Loop">DSL</abbr> coverage on
                PEI. The <abbr title="Keyhole Markup Language">KML</abbr> file is mostly useless unless it is displayed on a
                virtual globe, but because it does not contain Google's imagery,
                it can be redistributed widely. The <abbr title="Keyhole Markup Language">KML</abbr> file is usually less
                than 4 MB in size. It can be compressed to under
                200 kB as a <abbr title="Keyhole Markup language, Zipped">KMZ</abbr> file,
                although <abbr title="ForgottenIslanderBot">FIB</abbr> doesn't support this yet.<br />
                For privacy reasons, <abbr title="ForgottenIslanderBot">FIB</abbr> does not retain civic addresses in the
                <abbr title="Keyhole Markup Language">KML</abbr> files it produces; this could be enabled, at the cost of a
                larger file.
        </p>
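                <p>
                        To make the structure of such a file concrete, here is a minimal Python 3 sketch that writes one placemark per point. It is not <abbr title="ForgottenIslanderBot">FIB</abbr>'s generator: the function name <b>build_kml</b> and the sample point are made up for illustration, icons and styles are left out, and the namespace is the standard <abbr title="Keyhole Markup Language">KML</abbr> 2.2 one rather than whatever <abbr title="ForgottenIslanderBot">FIB</abbr> actually emits. In keeping with the privacy note above, each placemark is named after its status, not its civic address.
                </p>

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def build_kml(points):
    """Build a KML document from (latitude, longitude, label) tuples."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element("{%s}kml" % KML_NS)
    document = ET.SubElement(kml, "{%s}Document" % KML_NS)
    for latitude, longitude, label in points:
        placemark = ET.SubElement(document, "{%s}Placemark" % KML_NS)
        ET.SubElement(placemark, "{%s}name" % KML_NS).text = label
        point = ET.SubElement(placemark, "{%s}Point" % KML_NS)
        # KML lists longitude first, then latitude.
        coordinates = ET.SubElement(point, "{%s}coordinates" % KML_NS)
        coordinates.text = "%f,%f" % (longitude, latitude)
    return ET.tostring(kml, encoding="unicode")

# One illustrative point; the coordinates are not taken from real FIB data.
print(build_kml([(46.238, -63.126, "7 Mbps DSL")]))
```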
                <a name="4.4.1" />
                <h4>4.4.1. Producing the <abbr title="Keyhole Markup Language">KML</abbr> file</h4>
                <p>
                All <nobr>68 000</nobr> points are too many to expect a virtual
                globe to display, due to memory and processing requirements. To
                make lower-resolution maps, <abbr title="ForgottenIslanderBot">FIB</abbr> uses the same sparsity option
                employed for crawling. The --stats command (see <a href="#4.3.2">section 4.3.2</a>) can be used to determine how
                many points would be in the map. A sparsity of 40 produces a
                rather cumbersome map.
        </p>
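                <p>
                        The effect of the sparsity option on map size can be sketched in a few lines of Python 3. This assumes that a sparsity of <i>n</i> keeps every <i>n</i>th point, which is not confirmed here; the function name <b>thin</b> is hypothetical, and the numbers in the list merely stand in for address records.
                </p>

```python
def thin(points, sparsity):
    """Keep every sparsity-th point, starting with the first.

    sparsity is expected to be a positive whole number.
    """
    return points[::sparsity]

points = list(range(68000))   # stand-ins for roughly 68 000 address records
print(len(thin(points, 1)))   # every point
print(len(thin(points, 40)))  # one point in forty
```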
                <p>
                Assuming your <b>database.sqlite</b> is in the same folder as
                <abbr title="ForgottenIslanderBot">FIB</abbr>, and you wish to produce <b>map.<abbr title="Keyhole Markup Language">KML</abbr></b>, run:
                <div class="cmd"><nobr>python fib-dbutil.py database.sqlite
                --<abbr title="Keyhole Markup Language">KML</abbr> nyyy 40 n</nobr>
                        </div>
                where
                <dl>
                        <dt>nyyy</dt>
                                <dd>is the <i>status mask</i>--it sets which <abbr title="Digital Subscriber Loop">DSL</abbr>
                        statuses to display in the map. It comprises four
                        <b>y</b>/<b>n</b> yes/no values, each one defining
                        whether to display a particular <abbr title="Digital Subscriber Loop">DSL</abbr> status or not.
                        Respectively, the four statuses are unchecked, no <abbr title="Digital Subscriber Loop">DSL</abbr>,
                        1.5 Mbps <abbr title="Digital Subscriber Loop">DSL</abbr>, and 7 Mbps <abbr title="Digital Subscriber Loop">DSL</abbr>. In the example, unchecked
                        addresses will not be displayed. To display only addresses
                        without <abbr title="Digital Subscriber Loop">DSL</abbr> availability, use <b>nynn</b>. Similar terms
                        influence the --stats command; </dd>
                                <dt>40</dt>
                                <dd>is the sparsity; to get every address, use 1,
                        although that is inadvisable unless you're only
                        displaying no-<abbr title="Digital Subscriber Loop">DSL</abbr> addresses, with 'nynn'; and</dd>
                                <dt>n</dt>
                                <dd>is no, meaning not to skip the cities.</dd>
                        </dl>
                </p>
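                <p>
                        The status mask described above can be read as four independent yes/no switches. The following Python 3 sketch shows one way to interpret it; it is an illustration, not <abbr title="ForgottenIslanderBot">FIB</abbr>'s parser, and the names <b>parse_status_mask</b> and <b>shown</b> are hypothetical.
                </p>

```python
# The four statuses, in the order the mask lists them.
STATUSES = ("unchecked", "no DSL", "1.5 Mbps DSL", "7 Mbps DSL")

def parse_status_mask(mask):
    """Map each status to True (display) or False (hide)."""
    if len(mask) != len(STATUSES) or any(c not in "yn" for c in mask):
        raise ValueError("status mask must be four y/n characters")
    return {status: flag == "y" for status, flag in zip(STATUSES, mask)}

def shown(mask):
    """List the statuses a mask would display."""
    flags = parse_status_mask(mask)
    return [status for status in STATUSES if flags[status]]

print(shown("nyyy"))  # everything except unchecked addresses
print(shown("nynn"))  # only addresses without DSL
```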
                <a name="4.4.2" />
                <h4>4.4.2. Using the <abbr title="Keyhole Markup Language">KML</abbr> with Google Earth</h4>
                <p>
                Once you have generated your <abbr title="Keyhole Markup Language">KML</abbr> file, you have to load it into
                your virtual globe. In some operating systems, you can
                double-click the <abbr title="Keyhole Markup Language">KML</abbr> file, or run it like a program from your
                command line:
                <div class="cmd"><nobr>map.<abbr title="Keyhole Markup Language">KML</abbr></nobr>
                        </div>
                However, most likely you will load it from within your virtual globe
                program. In Google Earth, click the <i>File&gt;Open...</i> menu
                item, select the <abbr title="Keyhole Markup Language">KML</abbr> file (which will probably be in the
                same folder as <abbr title="ForgottenIslanderBot">FIB</abbr>), and click <i>Open</i>. If you don't want to
                do this each time you launch Google Earth, you can drag the
                <i>forgottenislanderbot</i> item from the Temporary Places
                branch of the sidebar into the My Places branch.<br />
                The procedure for NASA World Wind is probably not much
                different.
                </p>
                <p>
                        At present, unchecked addresses appear white,
                        no-<abbr title="Digital Subscriber Loop">DSL</abbr> addresses
                        appear yellow, 1.5 Mbps <abbr title="Digital Subscriber Loop">DSL</abbr>
                        addresses appear cyan, and 7 Mbps <abbr title="Digital Subscriber Loop">DSL</abbr>
                        addresses appear green.
                </p>
                <a name="4.4.3" />
                <h4>4.4.3. Distributing the map</h4>
                <p>
                        The <abbr title="Keyhole Markup Language">KML</abbr> file can be distributed through email,
                        websites/blogs, or other ways of transferring
                        digital data, like CDs or USB drives. It can be
                        compressed in a zip archive file, so long as it is
                        unzipped before use. Zipping does not help a <abbr title="Keyhole Markup language, Zipped">KMZ</abbr> file, though, since it is already compressed.<br />
                        Images from virtual globes can be saved, although you
                        may want to check the imagery licenses before
                        distributing them.
                </p>
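                <p>
                        A <abbr title="Keyhole Markup language, Zipped">KMZ</abbr> file is an ordinary zip archive holding a <abbr title="Keyhole Markup Language">KML</abbr> file, so one can be packed by hand before distribution. The Python 3 sketch below shows the idea with the standard <b>zipfile</b> module. It is not a feature of <abbr title="ForgottenIslanderBot">FIB</abbr>: the function name <b>kml_to_kmz_bytes</b> is hypothetical, <b>doc.kml</b> is used as the conventional name of the main file inside the archive, and the sample text is a repetitive placeholder, not real map data.
                </p>

```python
import io
import zipfile

def kml_to_kmz_bytes(kml_text):
    """Pack KML text into an in-memory KMZ (zip) archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("doc.kml", kml_text)
    return buffer.getvalue()

# Placeholder text standing in for the contents of a KML file.
sample = "\n".join("-63.126000,46.238000" for _ in range(1000))
kmz = kml_to_kmz_bytes(sample)
print(len(sample), "characters packed into", len(kmz), "bytes")
```

                <p>
                        Writing the returned bytes to a file opened in binary mode, named for example <b>map.kmz</b>, should give a file that virtual globe software can open directly. How much a real map shrinks depends on its contents.
                </p>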
                <a href="#0">[ top ]</a>
                <hr />
                <a name="5" />
                <h2>5. Licenses</h2>
                <h3>forgottenislanderbot</h3>
                <p><pre>
                forgottenislanderbot - makes a <abbr title="Digital Subscriber Loop">DSL</abbr> coverage map from Bell Aliant's website.
                Copyright (c) 2010 Art Ortenburger
 
                This program is free software; you can redistribute it and/or
                modify it under the terms of the <a href="http://www.gnu.org/licenses/gpl.txt">GNU General Public License</a>
                as published by the Free Software Foundation; either version 2
                of the License, or any later version.
 
                This program is distributed in the hope that it will be useful,
                but WITHOUT ANY WARRANTY; without even the implied warranty of
                MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
                GNU General Public License for more details.
 
                You should have received a copy of the GNU General Public License
                along with this program; if not, write to the Free Software
                Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
 
                </pre></p>
                <h3>fib.html -- this documentation</h3>
                <p>
                        This document is licensed under the Creative Commons
                        Attribution-ShareAlike 2.5 Canada license. See footer for details.
                </p>
                <h3>where to find licenses for data used by <abbr title="ForgottenIslanderBot">FIB</abbr></h3>
                <p>
                        PEI government <a href="http://www.gov.pe.ca/civicaddress/">civic address databases</a> are not
                        specifically licensed on the government website, but
                        <ol>
                                <li>the <a href="http://www.gov.pe.ca/index.php3?number=1024403&amp;lang=E">website copyright page</a> states that
                                information on the site may be reproduced
                                without further permission for non-commercial
                                use, and</li>
                                <li>the <a href="http://www.gov.pe.ca/civicaddress/">civic address main page</a> lists the civic
                                address database with the subheading "USE IN
                                YOUR OWN APPLICATIONS".</li>
                        </ol>
                </p>
                <p>
                        Users of Google Earth must comply with Google's license
                        agreements for Google Earth, which can be found on Google's
                        website. Google Earth imagery may be under a separate license.
                </p>
                <p>
                        NASA World Wind must be used in accordance with the NASA
                        Open Source Agreement, which can be found on the NASA
                        World Wind website. NASA World Wind provides multiple
                        imagery layers. Some of these are in the public domain;
                        those which aren't are subject to a license agreement.
                </p>
                <p>
                        The Bell Aliant webpages are not retained by <abbr title="ForgottenIslanderBot">FIB</abbr>, and
                        the data stored is derived from the presence of various
                        phrases in the webpage, rather than being an actual excerpt
                        from the website. Use of the Bell Aliant website is
                        governed by multiple agreements and terms of service,
                        which Bell Aliant publishes on its website.
                </p>
                <a href="#0">[ top ]</a>
                <hr />
                <div class="footer">Copyright &copy; 2010 Art Ortenburger.<br />
                <a rel="license" href="http://creativecommons.org/licenses/by-sa/2.5/ca/"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/by-sa/2.5/ca/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/2.5/ca/">Creative Commons Licence</a>.</div>
        </body>
</html>