Friday, November 30, 2012

Week 14: IT Issues: Security and Privacy

Total "Terrorism" Information Awareness (TIA)

The NY Times reported that a tracking system called Total Information Awareness (TIA) was being built by the Defense Advanced Research Projects Agency (DARPA) to give law enforcement access to information that could lead to possible terrorists.

This would allow them to view private data without suspicion of wrongdoing or a warrant.

TIA planned to use information signatures, computer algorithms, and human analysis to assist the government in tracking terrorists.

A database would be created to act as a repository of information for a large number of people and their search information.

Financial records, medical records, communication records, travel records, and intelligence data would be contained in this database (along with other information).

Data mining was key to navigating this large amount of information.

A biometric technology was created to enable the identification and tracking of individuals. This would allow individuals to be identified from a distance and would greatly help in the TIA process.

Funding for the program was cut in 2003 and the office that created it was closed. However, similar projects followed: one from the Intelligence Community Advanced Research and Development Activity (ARDA) and a joint project from the FBI and the Transportation Security Administration.

Chapter 10 from No Place to Hide

by Robert O'Harrow, Jr.

“No Place to Hide” details how the government is able to track information about people through almost everything they do, such as:

Metro Cards

Cameras at varying locations

Credit Cards

Clocking into work

GPS devices

Phone

Television (TiVo)

E-Z Pass

Toll Booths

Car Keys

800 phone calls that track your conversation

Hotel Keys

Logins to websites track everything. Newspapers, for example, use your login to track what you read.

Police are using this data to change the way they investigate, as in the case of Jonathan Luna, a missing federal prosecutor who was found dead in a PA creek and whose movements investigators traced through his toll records.

Attorneys use it as well to track clients

RFID is what makes everything work.

There is no end to the kinds of product monitoring or personal surveillance by companies, law enforcement, or private investigators.

Some schools and jails are even using this technology.

Schools are using this for automatic attendance.

Requiring students to wear IDs with the tag inside identifies which students are present, and can even bring up a picture of each student, as soon as they enter the building.

SafeTzone is a product being used by amusement parks to help parents track their children, pay for things without using wallets, and decrease waiting times for rides.

This is also changing customer relations, allowing businesses to “know” customers as soon as they arrive.

MyTurn: Protecting privacy rights in libraries

By Judah Hamer • September 24, 2008

Vermont passed a new law that made many feel as though their information seeking privacy would not be protected by libraries.

Judah Hamer released an article trying to clarify the new law, stating: “the new law makes it clear that patron records are confidential and can be shared with a third party only in response to a judicial order or warrant.”

Having taken the opinions of parents into consideration, the legislators also decided to release information to custodial parents of children under 16.

Information concerning patrons over this age will not be released, due to the importance of the information they may need to seek (child abuse, health related questions, alcoholic parents, etc).

The letter written by Eileen Haupt (to which this article responds) suggested that the library stood in the way of the investigation of the Brooke Bennett case by not immediately complying with police demands to release library computers for investigation.

This article states that the librarian informed police of the legally binding policy that required officers to have a court order. The police officer requested back ups and continued to argue with the librarian.

After the court order was eventually obtained, computers were released.

In response to this event, the author states: “the new law provides greater assurance to patrons across Vermont that their reading habits and research interests are private matters that they alone can decide to share with others”

Muddiest Point: What happens if an issue regarding a library's social site arises and there is no policy? Who constructs the policy?

Friday, November 23, 2012

Week 13 Notes

Charles Allan, “Using a wiki to manage a library instruction program: Sharing knowledge to better serve patrons,” C&RL News, April 2007 Vol. 68, No. 4

A wiki is a multi-author collaborative effort to share information.

Can be used by librarians to demonstrate and manage library instruction programs.

Wikis can be used to improve information sharing, facilitate collaboration on resources, and divide workloads among librarians.

Library instruction wikis serve to share information and to let librarians cooperate in creating resources such as information handouts and guides.

Many free sites to build wikis are available; the option to upgrade for a fee is usually available also.

At East Tennessee State University, the librarian gives instructional sessions for students that highlight the specific topics the instructor asked them to cover; if no specific topic was requested, the librarian gives a general overview of information and resources.

Many times new information needs will become apparent during the course of the instructional period- the librarian can then post these needs to the wiki once the answer is found.

Storing information in the wiki about weak points of understanding that different professors or students may have lets other librarians know to prepare for them and gives them the opportunity to have a solution ready.

Wikis can allow for quick updates to ever-changing information and directions on how to use resources

WIKIPEDIA VIDEO

Wikipedia is truly a global website. Many websites are accessible in different languages.

Only employee is one editor, rest are volunteers

Is more popular than New York Times

Over 90 servers in 3 locations

Stand by crew, 24 hours a day- due to volunteers who manage everything

Wikipedia has been proven to be more accurate than other encyclopedias

Wales says that controversy is not really an issue because most people understand the need for neutrality

Operates on a Neutral Point of View policy to avoid controversy within the community of Wikipedia contributors

There is software for contributors to immediately see who is editing the pages, so that they may quickly be reviewed for accuracy

On the “voting page” of Wikipedia, different people's voices carry different amounts of weight.

Wales views himself as a monarch to protect the integrity of the information

Wikipedia (in some eyes) has an edge over textbooks because of its neutrality vs the inherent bias of textbooks

Wikibooks is a new project they are building on.

Digital Librarianship & Social Media: the Digital Library as Conversation Facilitator

Robert A. Schrier
Syracuse University
raschrie@syr.edu

There is an awareness gap between the resources/holdings of digital libraries and the communities they serve

Most people use a search engine before they use a library search

There is more information available about how to build digital resources than about how to market them

The importance of librarians and their ability to communicate to the community the existence and importance of resources has been undermined

The first principle for digital librarians is to listen and engage with users about their needs

Google Alerts, Twitter, blog sites, etc. are all places to listen.

Many libraries have been failing because even when they use social media to communicate, they put the information out and do not engage or listen to the responses- this makes it seem as if the library does not care.

Trust must be developed by a desire to help people and answer their questions, rather than a desire to promote the library

“Establishing the digital library as an important participant in the knowledge community not only heightens the level of trustworthiness of the collection, it also builds a base of dedicated users that are able to talk to each other.”

Principle Three is transparency. This allows users to contribute to the library and reinforces trusted relationships.

The library needs to move away from the fear of negative response to their public image in order to accomplish this.

Principle Four is policy. Policies should be written and implemented that state and enforce appropriate sharing behaviors, so as to not bring negative feedback to the library.

Principle Five is planning. Many libraries create a social networking site and forget to maintain it.

To prevent this it should be included in the marketing and strategic plans.

“If digital libraries truly want to create a lasting and rewarding social media program, they need to think ahead about who will be responsible for creating content, maintaining the site, and responding to users when necessary.”

*ALA link did not work.

*NO MUDDIEST POINT FOR THIS WEEK!

Friday, November 16, 2012

Week 12

Deep web stores content in searchable databases that require specific inquiries

BrightPlanet allows for surface and deep web requests

Searching of both is imperative for the user to retrieve the maximum amount of information.

Current search engines only retrieve 1 of 3,000 pages available.

The World Wide Web is only part of the internet, which also includes FTP, email, news, Telnet, Gopher, and other services

Search engine dissatisfaction has increased steadily since 1997.

Search engines crawl or spider to record every hyperlink on pages to gather information

Authors can also submit their pages

Surface web has 2.5 billion documents

BrightPlanet is a directed query engine

The NEC found:

· Surface Web coverage by individual, major search engines has dropped from a maximum of 32% in 1998 to 16% in 1999, with Northern Light showing the largest coverage.

· Metasearching using multiple search engines can improve retrieval coverage by a factor of 3.5 or so, though combined coverage from the major engines dropped to 42% from 1998 to 1999.

· More popular Web documents, that is, those with many link references from other documents, have up to an eight-fold greater chance of being indexed by a search engine than those with no link references.

Deep web documents are on average 27% smaller than surface web documents

Deep web is 500 times larger than surface web

Searching needs to include the whole web

Directed query technology is the only means to integrate deep and surface Web information.

The simplest crawling algorithm uses a queue of URLs yet to be visited and a fast mechanism for determining if it has already seen a URL.

The crawler makes an HTTP request to get a page; once it has the page, it scans it for links to other URLs
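The queue-plus-seen-set idea above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine does it: the FAKE_WEB dictionary and its example.org URLs are made up and stand in for real HTTP requests, so that the sketch runs without a network.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical pages standing in for real HTTP responses (assumption).
FAKE_WEB = {
    "http://example.org/": '<a href="http://example.org/a">A</a> '
                           '<a href="http://example.org/b">B</a>',
    "http://example.org/a": '<a href="http://example.org/b">B</a>',
    "http://example.org/b": '<a href="http://example.org/">home</a>',
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch=FAKE_WEB.get):
    """Visit pages breadth-first; return URLs in the order they were fetched."""
    queue = deque([start_url])  # URLs yet to be visited
    seen = {start_url}          # fast "have I already seen this URL?" check
    visited = []
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:        # page could not be fetched; skip it
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(page)       # scan the page for links to other URLs
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Calling crawl("http://example.org/") visits each of the three made-up pages once, even though they link back to each other, because the seen set stops a URL from being queued twice.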

Real crawlers must address:

Speed

Politeness

Excluded content

Duplicate content

Modern spammers create artificial web landscapes of domains, servers, links, and pages to inflate the link scores of the targets they have been paid to promote. Spammers also engage in cloaking, the process of delivering different content to crawlers than to site visitors.

An inverted file is a concatenation of the postings lists for each distinct term.

Scanning and inversion create an inverted file

Scaling up merges partial inverted files
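A small Python sketch of the two steps above, scanning documents into postings lists and then merging partial inverted files. The function names invert and merge and the four tiny documents are my own made-up examples; real indexers also store positions and compress the lists, which this leaves out.

```python
from collections import defaultdict


def invert(docs):
    """Scan documents and build {term: sorted list of document ids}."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def merge(a, b):
    """Merge two partial inverted files into one."""
    merged = {}
    for term in sorted(set(a) | set(b)):
        merged[term] = sorted(set(a.get(term, [])) | set(b.get(term, [])))
    return merged


# Two partial inverted files, built from separate batches of documents.
part1 = invert({1: "deep web search", 2: "surface web"})
part2 = invert({3: "web crawler", 4: "deep crawler"})
index = merge(part1, part2)
```

After the merge, index["web"] lists documents 1, 2, and 3, so a query for "web" only has to read one postings list instead of scanning every document.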

Indexers use compression to reduce demands on disk space and memory

Anchor text contributes strongly to the quality of search results.

Average query lengths are two to three words

NO MUDDIEST POINT FOR THIS WEEK

Friday, November 9, 2012

Week 11

Mischo, W. (July/August 2005). Digital Libraries: challenges and influential work. D-Lib Magazine. 11(7/8)

There is a difference between providing digital library services and providing access to digital collections

Gateway and navigation services have been added to address these concerns

“The mantra has been: aggregate, virtually collocate, and federate”

The Digital Libraries Initiative (now called DLI-1) was established in 1994 to provide federal funding for digital library research

Several other organizations contributed funds for research in the following years, totaling $68 million in federal funds

Six universities led projects that developed computing and networking technologies. The following universities were included:

“The University of Michigan for research on agent technology and mechanisms for improving secondary education;

Stanford University for the investigation of interoperability among heterogeneous digital libraries and the exploration of distributed object technology;

The University of California-Berkeley for imaging technologies, government environmental information resources, and database technologies;

The University of California-Santa Barbara for the Alexandria Project to develop GIS (Geographical Information Systems) and earth modeling distributed libraries;

Carnegie Mellon University for the study of integrated speech, image, video, and language understanding software under its Informedia system; and

The University of Illinois at Urbana-Champaign for the development of document representation, processing, indexing, search and discovery, and delivery and rendering protocols for full-text physics, computer science, and engineering journals.”

The Illinois project transferred technology to its publishing partners, which contributed to web-based access to full-text journals.

Most of the publishers that support this feature use a structure very close to that of the Illinois project

Many significant digital library standards and technologies have been developed by entities outside of the federally funded projects.

Paepcke, A. et al. (July/August 2005). Dewey meets Turing: librarians, computer scientists and the digital libraries initiative. D-Lib Magazine. 11(7/8).

DLI- Digital Libraries Initiative

Librarians were excited about DLI because they knew information technology was important to their impact on the scholarly world

Librarians recognized the value of new capabilities such as holdings management, instant access, and Online Public Access Catalogs (OPACs), which enabled digital searching

The advent of the internet greatly challenged computer scientists and librarians

Computer scientists and librarians teamed up to create a resource to search, organize, and browse

At times computer scientists believed that librarians would become less relevant, but librarians reminded them how much key information is involved in searching and how much of it librarians would need to assist with

The line between consumers and producers of information was blurred by the internet

There were impediments in the process in regards to sharing information mostly because of copyright issues

Computer scientists were drawn to this because of the natural connections with information sharing, machine learning, statistical approaches, and other heuristic approaches

After relevance was expanded to include different subject areas, more researchers got involved

The web’s easy retrieval of so much information made people much more relaxed about the accuracy of search results

There are some hard feelings between librarians and computer scientists because some felt the DLI money would be available for the collection but it was not

Some also felt DLI created an environment that made librarians look less relevant

Hubs have re-introduced the notion of collections

Lynch, Clifford A. "Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age" ARL, no. 226

Institutional repositories should include the works of both faculty and students, documentation of events of the institution, and research and teaching materials, experimental and observational data

Institutional repositories are supposed to be a recognition of intellectual life and scholarship that can be shared digitally

Faculty (and their institutions) are at a disadvantage because faculty have had to play the role of systems administrators, which takes time away from their research and teaching. Some do not have the skills to do so, and converting information to current systems often leaves it inaccessible at some point

It becomes the task of faculty to argue the legitimacy of investing in works of digital scholarship

Preservation is a main requirement to enable this

Scientific journals are accepting articles from other disciplines as “supplementary” materials

Disciplinary repositories will never fully be comprehensive (except for a very few-such as the sciences)

If faculty are properly empowered, institutional repositories can be greatly enhanced with scholarly content

Institutional repositories exert control over what has typically been faculty controlled work

Institutions have been overloaded with irrelevant policy baggage

The quality of these repositories may decrease because institutions may rush to implement them without enforcing quality

Policy, management failure, incompetence, and technical problems could cause them to fail over time

Preservable formats, identifiers and rights and documentation management are vital for institutional repositories

Muddiest Point-

I understood the concept of XML- it just seems as if it requires practice

Friday, November 2, 2012

Week 10

The first article:

https://burks.bton.ac.uk/burks/internet/web/xmlintro.htm

gave me the message: This webpage is not available

The webpage at https://burks.bton.ac.uk/burks/internet/web/xmlintro.htm might be temporarily down or it may have moved permanently to a new web address.

Error 501 (net::ERR_INSECURE_RESPONSE): Unknown error.

I was not able to access it to write notes.

Uche Ogbuji. A survey of XML standards: Part 1. January 2004

XML

XML (which uses Unicode) is the base of the expanding XML technology.

XML has strict rules for text formats and document type definition

The main change in XML 1.1 is that it adjusts the treatment of characters so that XML adapts better to changes in Unicode

XML is based on Standard Generalized Markup Language (SGML). It is a simplified form of SGML with adjustments that make it better suited to the web

XML Catalogs gives a format for telling an XML processor how to resolve XML entity identifiers into documents.

Uniform Resource Identifiers = URIs

URIs are an extension of URLs: they add Uniform Resource Names (URNs), which allow web resources to be referred to by name instead of location.

Public identifiers are usually specified as Formal Public Identifiers (FPIs), defined in SGML

An XML catalog is itself an XML document; a catalog format in simpler text, called OASIS Open Catalog, is also defined

XML Namespaces enables universal naming of elements.

Namespaces allow the same names to be used in more than one vocabulary by creating vocabulary markers and a special syntax for declaring them.

There are mixed opinions on namespace because some people believe the gain is not worth the pain.
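To make the namespace idea concrete, here is a short Python sketch using the standard xml.etree.ElementTree module. The document, the lib and shop prefixes, and the example.org namespace URIs are all made up for illustration. Both child elements are called title, but each belongs to a different vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two vocabularies that both define <title>.
DOC = """<catalog xmlns:lib="http://example.org/library"
         xmlns:shop="http://example.org/shop">
  <lib:title>No Place to Hide</lib:title>
  <shop:title>Bestseller of the week</shop:title>
</catalog>"""

root = ET.fromstring(DOC)

# Map prefixes to namespace URIs so the two <title> elements can be told apart.
ns = {"lib": "http://example.org/library", "shop": "http://example.org/shop"}

lib_title = root.find("lib:title", ns).text
shop_title = root.find("shop:title", ns).text

# ElementTree stores each tag as {namespace-uri}local-name.
expanded_tags = [child.tag for child in root]
```

The prefix is only shorthand; what actually distinguishes the two elements is the namespace URI, which is why expanded_tags shows the full URI in braces in front of each title.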

Jonathan Borden and Tim Bray (and the rest of the XHTML community) created the Resource Directory Description Language (RDDL). It offers prose descriptions of a vocabulary, with embedded XLink to help readers navigate to key resources for understanding the namespace.

XInclude was still in development at the time of the article. It is used to separate XML documents into “manageable” chunks. With XInclude, documents can be separated and pieced back together.

XML Infoset = XML Information Set.

Describes an XML document as a series of objects, called information items, with specialized properties.

Canonical XML Version 1.0 is a standard method for generating a physical representation of an XML document, called the canonical form, that accounts for the variations allowed in XML syntax without changing meaning.
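A quick way to see what a canonical form buys you is Python's xml.etree.ElementTree.canonicalize function (available since Python 3.8). Note the assumption here: that function implements the later Canonical XML 2.0, not the 1.0 version from the article, so this is only a sketch of the general idea. The two book documents are made up.

```python
from xml.etree.ElementTree import canonicalize

# Two made-up documents with the same meaning but different syntax:
# attribute order, quote style, spacing, and empty-element form all differ.
doc_a = '<book lang="en" id="1"><title/></book>'
doc_b = "<book id='1'   lang='en'><title></title></book>"

canonical_a = canonicalize(doc_a)
canonical_b = canonicalize(doc_b)

# The raw strings differ, but the canonical forms can be compared directly.
same_meaning = canonical_a == canonical_b
```

Here same_meaning comes out True even though doc_a and doc_b are different strings, which is the point of having one agreed physical representation.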

XML Path Language (XPath) 1.0 is a syntax and a data model for addressing parts of an XML document.
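As a small illustration of addressing parts of a document, Python's standard xml.etree.ElementTree module supports a limited subset of XPath, not the full XPath 1.0 language. The library document below, including its titles and year values, is sample data I made up for the sketch.

```python
import xml.etree.ElementTree as ET

# Made-up sample document (assumption, for illustration only).
DOC = """<library>
  <book year="2005"><title>No Place to Hide</title></book>
  <book year="2004"><title>XML in Practice</title></book>
  <journal><title>D-Lib Magazine</title></journal>
</library>"""

root = ET.fromstring(DOC)

# Titles that are children of <book> elements directly under the root.
book_titles = [t.text for t in root.findall("./book/title")]

# Every <title> anywhere in the document.
all_titles = [t.text for t in root.findall(".//title")]

# Only the title of the book whose year attribute is 2005.
titles_2005 = [t.text for t in root.findall("./book[@year='2005']/title")]
```

Each path expression reads a bit like a file path: a slash steps down one level, a double slash searches all descendants, and the part in square brackets filters on an attribute.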

XML Schema Part 1: Structures and XML Schema Part 2: Datatypes; the first part allows one to constrain the structure of the document, and the second part allows one to constrain the contents of simple elements and attributes.

This week I did not have a specific muddiest point. CSS seemed very interesting, just an overload of information!

Thursday, November 1, 2012

Week 9 CSS

CSS

Webpage designers use HTML to mark up a document’s structure.

Browsers are given instructions from HTML as to how to display elements.

CSS puts the designer in control: HTML builds the structure of the content and CSS controls how it is displayed

A rule describes one aspect of style such as color.

A style sheet consists of one or more rules for HTML

The part before the brace, “H1” for example, is called the selector. The part within the braces, {color:green}, is called the declaration.

The declaration tells what will be done to the selector

A declaration contains a property and a value, for example color: green

CSS allows the designer to basically short hand directions like:

H1 {

  color: green;

  text-align: center;

}

All the declarations apply to the same selector, so they are grouped together and separated by semicolons

HTML and CSS have to be joined together to properly present the document. This can be achieved in four ways:

1. Apply the basic, document-wide style sheet for the document by using the style element.

2. Apply a style sheet to an individual element using the style attribute.

3. Link an external style sheet to the document using the link element.

4. Import a style sheet using the CSS @import notation.

The browser needs to be told that the style sheet is written in CSS. That could look like this: "text/css."

Inheritance in CSS allows property values set on an element to be transferred to the elements inside it, for example from BODY to H1 and H2.

Some properties, such as background, do not inherit.

CSS dictates things like fonts, for example H1 { font: 36pt serif }, and margins, through margin-top, margin-right, margin-bottom, and margin-left.

Links are also dictated by CSS, determining how the person visiting the site will view them.  An example would be: A:link, A:visited { text-decoration: none }

A:hover { background: cyan }

NO muddiest point