Monthly Archives: January 2018

Each of us has faced the problem of searching for information more than once. Regardless of the data source we are using (the Internet, the file system on our hard drive, a database or the global information system of a big company), the problems can be multiple: the physical volume of the data searched, the unstructured nature of the information, the variety of file types and the difficulty of wording the search query accurately. We have already reached the stage where the amount of data on a single PC is comparable to the amount of text stored in a proper library. As for unstructured data flows, they are only going to increase in the future, and at a very rapid pace. For an average user this might be just a minor misfortune, but for a big company a lack of control over information can mean significant problems. So the need for search systems and technologies that simplify and accelerate access to the necessary information arose long ago. Such systems are numerous, and not every one of them is based on a unique technology. Choosing the right one depends directly on the specific tasks to be solved. While the demand for perfect data searching and processing tools is steadily growing, let's consider the state of affairs on the supply side.

Without going deeply into the peculiarities of the technology, all search programs and systems can be divided into three groups: global Internet systems, turnkey business solutions (corporate data searching and processing technologies) and simple phrase or file search on a local computer. Different directions presumably mean different solutions.

Local search

Everything is clear about search on a local PC. It is not remarkable for any particular functionality except for the choice of file type (media, text etc.) and the search location. Just enter the name of the file you are looking for (or part of its text, for example in a Word document) and that's it. The speed and the result depend fully on the text entered into the query line. There is no intelligence in this: the program simply looks through the available files to determine their relevance. This is understandable: what's the use of creating a sophisticated system for such uncomplicated needs?
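
The kind of local search described here can be sketched in a few lines. This is a minimal illustration, not the code of any actual desktop search tool; the function name local_search and its parameters are hypothetical.

```python
import os

def local_search(root, query, search_contents=False):
    """Return sorted paths under root whose file name contains query.

    If search_contents is True, files whose text contains query are
    returned as well. Matching is a plain case-insensitive substring test.
    """
    query = query.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if query in name.lower():
                hits.append(path)
            elif search_contents:
                try:
                    with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                        if query in fh.read().lower():
                            hits.append(path)
                except OSError:
                    pass  # unreadable file: skip it
    return sorted(hits)
```

Every file is opened and scanned on every query, which is tolerable on one hard drive but is exactly what stops working at the scale discussed in the next section.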

Global search technologies

Matters stand totally differently with the search systems operating on the global network. One can't rely simply on looking through the available data. The huge volume of unstructured information in the global chaos (Yandex, for instance, can boast an indexing capacity of more than 11 terabytes of data) makes simple search not only ineffective but also long and labor-consuming. That's why the focus has lately shifted towards optimizing and improving the quality of search. But the scheme is still very simple (except for the secret innovations of each separate system): phrase search through an indexed database with proper consideration for morphology and synonyms. Undoubtedly, such an approach works, but it doesn't solve the problem completely. Reading dozens of articles dedicated to improving search with the help of Google or Yandex, one can arrive at the conclusion that, without knowing the hidden capabilities of these systems, finding a relevant document for a query is a matter of more than a minute, and sometimes more than an hour. The problem is that such an implementation of search is very dependent on the query word or phrase entered by the user. The vaguer the query, the worse the search. This has become an axiom, or dogma, whichever you prefer.
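
The general scheme (an index built in advance, then queries matched against it with some allowance for word forms and synonyms) can be illustrated with a toy example. This is only a sketch under stated assumptions: the suffix-stripping stem function is a crude stand-in for real morphological analysis, the SYNONYMS table is a hypothetical two-word thesaurus, and the query is treated as a set of keywords rather than an exact phrase. It does not describe how Google or Yandex actually work.

```python
import re
from collections import defaultdict

# Hypothetical toy thesaurus, keyed by stemmed word.
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}

def stem(word):
    """Crude suffix stripping; a stand-in for real morphological analysis."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase the text, split it into words and stem each word."""
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]

def build_index(docs):
    """Map each stemmed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents matching every query word (or a synonym)."""
    result = None
    for base in tokenize(query):
        variants = {base} | {stem(s) for s in SYNONYMS.get(base, ())}
        matches = set()
        for variant in variants:
            matches |= index.get(variant, set())
        result = matches if result is None else result & matches
    return result or set()
```

With documents {1: "Cars parked outside", 2: "An automobile show", 3: "Parking rules"}, the query "cars" matches documents 1 and 2 through the synonym table, while "parking cars" matches only document 1. The result still depends entirely on which words the user happened to type.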

Of course, by intelligently using the key functions of the search systems and properly defining the phrase by which the documents and sites are searched, it is possible to get acceptable results. But this would be the result of painstaking mental work and of time wasted on looking through irrelevant information in the hope of finding at least some clues on how to improve the search query. In general, the scheme is the following: enter the phrase, look through several results, realize that the query was not the right one, enter a new phrase, and repeat these stages until the relevancy of the results reaches the highest possible level. But even in that case the chances of finding the right document are still slim. No average user will voluntarily go for the sophistication of "advanced search" (although it is equipped with a number of very useful functions such as the choice of language, file format etc.). The best would be to simply enter the word or phrase and get a ready answer, without particular concern for the means of getting it. Let the horse think: it has a big head. Maybe this is not exactly to the point, but the name of one of the Google search functions, "I'm Feeling Lucky", characterizes the existing search technologies very well. Nevertheless, the technology works, not ideally and not always justifying the hopes, but if you allow for the complexity of searching through the chaos of Internet data, it could be called acceptable.

Corporate systems

The third on the list are the turnkey solutions based on search technologies. They are meant for serious companies and corporations that possess really large databases and are equipped with all sorts of information systems and documents. In principle, the technologies themselves can also be used for home needs. For example, a programmer working remotely from the office will make good use of search to access program source code scattered across his hard drive. But these are particulars. The main application of the technology is still solving the problem of quickly and accurately searching through large data volumes and working with various information sources. Such systems usually operate by a very simple scheme (although there are undoubtedly numerous unique methods of indexing and processing queries underneath the surface): phrase search with proper consideration for all the stem forms, synonyms etc., which once again leads us to the problem of the human factor. When using such a technology, the user must first word the query phrases that will serve as the search criteria and that are presumably found in the documents to be retrieved. But there is no guarantee that the user will be able to choose or remember the correct phrase independently and, furthermore, that the search by this phrase will be satisfactory.

One more key point is the speed of processing a query. Of course, when the whole document is used instead of a couple of words, the accuracy of the search increases manifold. But to date this opportunity has not been used because such a process is so demanding of computing capacity. The point is that search by words or phrases will not give us highly relevant, similar results, while search by a phrase equal in length to the whole document consumes much time and many computer resources. Here is an example: when processing a one-word query there is no considerable difference in speed; whether it takes 0.1 or 0.001 second is not of crucial importance to the user. But take an average-size document containing about 2000 unique words: a keyword search with consideration for morphology (stem forms) and thesaurus (synonyms), together with generating a relevant list of results, will take several dozen minutes, which is unacceptable for a user.
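
The arithmetic behind this example can be sketched roughly. The function below is an illustration only: the number of word forms and synonyms per word and the fixed cost per lookup are hypothetical values chosen for the sketch, not measurements from any real system.

```python
def estimated_query_seconds(unique_words, forms_per_word,
                            synonyms_per_word, seconds_per_lookup):
    """Rough cost model: one index lookup per form of each word and of each synonym."""
    lookups = unique_words * forms_per_word * (1 + synonyms_per_word)
    return lookups * seconds_per_lookup

# One word, no expansion: a single lookup at 0.1 second.
single = estimated_query_seconds(1, 1, 0, 0.1)

# A 2000-word document, assuming 5 forms and 2 synonyms per word.
whole_document = estimated_query_seconds(2000, 5, 2, 0.1)
minutes = whole_document / 60
```

Under these assumed values the whole-document query needs 30,000 lookups, or about 50 minutes, which is in line with the "several dozen minutes" mentioned above. The point is the multiplication, not the particular numbers.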

The interim summary

As we can see, the currently existing search systems and technologies, although properly functioning, don't solve the problem of search completely. Where the speed is acceptable, the relevancy leaves much to be desired. If the search is accurate and adequate, it consumes a lot of time and resources. It is of course possible to solve the problem in a very obvious manner: by increasing computing capacity. But equipping the office with dozens of ultra-fast computers that will continuously process phrase queries consisting of thousands of unique words, struggling through gigabytes of incoming correspondence, technical literature, final reports and other information, is more than irrational and disadvantageous. There is a better way.

The unique similar content search

At present many companies are working intensively on developing full text search. Current computing speeds allow the creation of technologies that support queries of different sizes and a wide array of supplementary conditions. Their experience in creating phrase search gives these companies the expertise to further develop and perfect the search technology. In particular, one of the most popular search engines is Google, and one of its functions is called "similar pages". This function lets the user view the pages whose content is most similar to that of a sample page. Although it works in principle, the function does not yet deliver relevant results: they are mostly vague and of low relevancy, and sometimes the function finds no similar pages at all. Most probably, this is the result of the chaotic and unstructured nature of information on the Internet. But once the precedent has been created, the advent of perfect search without a hitch is just a matter of time.

As for the corporate data processing and knowledge retrieval systems, here matters stand much worse. The technologies that actually function (rather than exist only on paper) are very few. And no giant or so-called search technology guru has so far succeeded in creating a real similar content search. Maybe the reason is that it's not desperately needed, or maybe it is too hard to implement. But there is a functioning one, though.

SoftInform Search Technology, developed by SoftInform, is a technology for searching for documents similar in their content to a sample. It enables fast and accurate search for documents of similar content in any volume of data. The technology is based on a mathematical model that analyzes the document structure and selects words, word combinations and text arrays, which results in a list of the documents most similar to the sample text, with the relevancy percentage defined for each. In contrast to standard phrase search, similar content search does not require the key words to be determined beforehand: the search is conducted using the whole document. The technology works with several sources of information, which can be stored both in text files of txt, doc, rtf, pdf, htm and html formats and in information systems built on the most popular databases (Access, MS SQL, Oracle, as well as any SQL-supporting databases). It additionally supports synonym and important-word functions that make it possible to carry out a more specific search.
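
The source does not disclose the mathematical model behind the technology, so the following is not SoftInform's algorithm. It is a minimal sketch of the general idea of searching by a whole sample document and reporting a relevancy percentage, using a textbook technique (cosine similarity over word counts). The function names are hypothetical.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Count how often each lowercase word occurs in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_similar(sample, documents):
    """Return (name, percent) pairs, most similar document first."""
    sample_vec = term_vector(sample)
    scored = [
        (name, round(100 * cosine_similarity(sample_vec, term_vector(text)), 1))
        for name, text in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The user supplies a sample text instead of choosing key words, and every document in the collection receives a score. A real system would need an index to avoid comparing the sample against every document on each query.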

The similar search technology makes it possible to significantly cut the time wasted on searching for and reviewing the same or very similar documents, and to reduce the processing time at the stage of entering data into the archive by avoiding duplicate documents and forming sets of data on a certain subject. Another advantage of the SoftInform technology is that it is not very demanding of computing capacity and allows data to be processed at a very high speed even on ordinary office computers.

This technology is not just a theoretical development. It has been tested and successfully implemented in a project for giving legal advice by phone, where the speed of information retrieval is of crucial importance. And it will undoubtedly be more than useful in any knowledge base, analytical service or support department of any large firm. The universality and effectiveness of the SoftInform Search Technology allow it to solve a wide spectrum of problems that arise while processing information. These include the fuzziness of information (at the document entry stage it is possible to determine immediately whether such a document already exists in the database or not), the similarity analysis of documents that have already been entered into the database, and the search for semantically similar documents, which saves the time spent on selecting appropriate key words and viewing irrelevant documents.
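
The duplicate check at the document entry stage can be illustrated with a generic near-duplicate test. Again, this is not the SoftInform method, only a sketch of one common approach: comparing overlapping word triples (shingles) by Jaccard similarity. The function names and the 0.8 threshold are assumptions made for the example.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word tuples of the given size."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Share of shingles common to both sets (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def find_duplicate(new_text, archive, threshold=0.8):
    """Return the name of the closest archived document, or None.

    A name is returned only if its similarity reaches the threshold.
    """
    new_shingles = shingles(new_text)
    best_name, best_score = None, 0.0
    for name, text in archive.items():
        score = jaccard(new_shingles, shingles(text))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

A document that differs from an archived one only in punctuation or letter case is reported as a duplicate, while an unrelated document returns None and can be entered into the archive.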

Prospects

Besides its primary purpose (fast, high-quality search through huge volumes of information such as texts, archives and databases), an Internet application could also be defined. For example, it is possible to work out an expert system to process incoming correspondence and news, which would become an important tool for analysts at different companies. Mainly, this will be possible due to the unique similar content search technology, absent so far from every existing system except SearchInform. The problem of spamming search engines with so-called doorways (hidden pages with key words that redirect to the site's main pages and are used to increase the page rating with the search engines) and the e-mail spam problem (a more intelligent analysis would ensure a higher level of security) could also be addressed with the help of this technology. But the most interesting prospect for the SoftInform Search technology is creating a new Internet search engine whose main competitive advantage would be the ability to search not just by key words but also for similar web pages, which would add to the flexibility of search, making it more comfortable and efficient.

To draw a conclusion, it can be stated with confidence that the future belongs to full text search technologies, both on the Internet and in corporate search systems. Unlimited development potential, adequacy of the results and fast processing of queries of any size make this technology much more comfortable and put it in high demand. SoftInform Search technology might not be the pioneer, but it is a functioning, stable and unique one with no existing analogues (which can be proved by the active Eurasian patent). To my mind, even with the help of "similar search" it will be difficult to find a similar technology.

In the past few years, research on instructional technology has resulted in a clearer vision of how technology can affect teaching and learning. Today, almost every school in the United States of America uses technology as a part of teaching and learning, with each state having its own customized technology program. In most of those schools, teachers use the technology through integrated activities that are a part of their daily school curriculum. For instance, instructional technology creates an active environment in which students not only inquire, but also define problems of interest to them. Such an activity would integrate the subjects of technology, social studies, math, science, and language arts with the opportunity to create student-centered activity. Most educational technology experts agree, however, that technology should be integrated, not as a separate subject or as a once-in-a-while project, but as a tool to promote and extend student learning on a daily basis.

Today, classroom teachers may lack personal experience with technology, which presents an additional challenge. In order to incorporate technology-based activities and projects into their curriculum, those teachers first must find the time to learn to use the tools and understand the terminology necessary for participation in projects or activities. They must have the ability to employ technology to improve student learning as well as to further personal professional development.

Instructional technology empowers students by improving skills and concepts through multiple representations and enhanced visualization. Its benefits include increased accuracy and speed in data collection and graphing, real-time visualization, the ability to collect and analyze large volumes of data, collaboration in data collection and interpretation, and more varied presentation of results. When used appropriately, technology also engages students in higher-order thinking, builds strong problem-solving skills, and develops deep understanding of concepts and procedures.

Technology should play a critical role in academic content standards and their successful implementation. Expectations reflecting the appropriate use of technology should be woven into the standards, benchmarks and grade-level indicators. For example, the standards should include expectations for students to compute fluently using paper and pencil, technology-supported and mental methods and to use graphing calculators or computers to graph and analyze mathematical relationships. These expectations should be intended to support a curriculum rich in the use of technology rather than limit the use of technology to specific skills or grade levels. Technology makes subjects accessible to all students, including those with special needs. Options for assisting students to maximize their strengths and progress in a standards-based curriculum are expanded through the use of technology-based support and interventions. For example, specialized technologies enhance opportunities for students with physical challenges to develop and demonstrate mathematics concepts and skills. Technology influences how we work, how we play and how we live our lives. The influence technology in the classroom should have on math and science teachers' efforts to provide every student with "the opportunity and resources to develop the language skills they need to pursue life's goals and to participate fully as informed, productive members of society," cannot be overestimated.

Technology provides teachers with the instructional tools they need to operate more efficiently and to be more responsive to the individual needs of their students. Selecting appropriate technology tools gives teachers an opportunity to build students' conceptual knowledge and connect their learning to problems found in the world. Technology tools such as Inspiration® technology, Starry Night, a WebQuest and Portaportal allow students to employ a variety of strategies such as inquiry, problem-solving, creative thinking, visual imagery, critical thinking, and hands-on activity.

Benefits of the use of these technology tools include increased accuracy and speed in data collection and graphing, real-time visualization, interactive modeling of invisible science processes and structures, the ability to collect and analyze large volumes of data, collaboration for data collection and interpretation, and more varied presentations of results.

Technology integration strategies for content instruction

Beginning in kindergarten and extending through grade 12, various technologies can be made a part of everyday teaching and learning, where, for example, the use of meter sticks, hand lenses, temperature probes and computers becomes a seamless part of what teachers and students are learning and doing. Content-area teachers should use technology in ways that enable students to conduct inquiries and engage in collaborative activities. In traditional or teacher-centered approaches, computer technology is used more for drill, practice and mastery of basic skills.

The instructional strategies employed in such classrooms are teacher centered because of the way they supplement teacher-controlled activities and because the software used to provide the drill and practice is teacher selected and teacher assigned. The relevancy of technology in the lives of young learners and the capacity of technology to enhance teachers' efficiency are helping to raise students' achievement in new and exciting ways.

As students move through grade levels, they can engage in increasingly sophisticated hands-on, inquiry-based, personally relevant activities where they investigate, research, measure, compile and analyze information to reach conclusions, solve problems, make predictions and/or seek alternatives. They can explain how science often advances with the introduction of new technologies and how solving technological problems often results in new scientific knowledge. They should describe how new technologies often extend the current levels of scientific understanding and introduce new areas of research. They should explain why basic concepts and principles of science and technology should be a part of active debate about the economics, policies, politics and ethics of various science-related and technology-related challenges.

Students need grade-level appropriate classroom experiences, enabling them to learn and to be able to do science in an active, inquiry-based fashion where technological tools, resources, methods and processes are readily available and extensively used. As students integrate technology into learning about and doing science, emphasis should be placed on how to think through problems and projects, not just what to think.

Technological tools and resources may range from hand lenses and pendulums, to electronic balances and up-to-date online computers (with software), to methods and processes for planning and doing a project. Students can learn by observing, designing, communicating, calculating, researching, building, testing, assessing risks and benefits, and modifying structures, devices and processes, all while applying their developing knowledge of science and technology.

Most students in the schools, at all age levels, might have some expertise in the use of technology; however, throughout K-12 they should recognize that science and technology are interconnected and that using technology involves assessment of the benefits, risks and costs. Students should build scientific and technological knowledge, as well as the skills required to design and construct devices. In addition, they should develop the processes to solve problems and understand that problems may be solved in several ways.

Rapid developments in the design and uses of technology, particularly in electronic tools, will change how students learn. For example, graphing calculators and computer-based tools provide powerful mechanisms for communicating, applying, and learning mathematics in the workplace, in everyday tasks, and in school mathematics. Technology, such as calculators and computers, helps students learn mathematics and supports effective mathematics teaching. Rather than replacing the learning of basic concepts and skills, technology can connect skills and procedures to deeper mathematical understanding. For example, geometry software allows experimentation with families of geometric objects, and graphing utilities facilitate learning about the characteristics of classes of functions.

Learning and applying mathematics requires students to become adept in using a variety of techniques and tools for computing, measuring, analyzing data and solving problems. Computers, calculators, physical models, and measuring devices are examples of the wide variety of technologies, or tools, used to teach, learn, and do mathematics. These tools complement, rather than replace, more traditional ways of doing mathematics, such as using symbols and hand-drawn diagrams.

Technology, used appropriately, helps students learn mathematics. Electronic tools, such as spreadsheets and dynamic geometry software, extend the range of problems and develop understanding of key mathematical relationships. A strong foundation in number and operation concepts and skills is required to use calculators effectively as a tool for solving problems involving computations. Appropriate uses of those and other technologies in the mathematics classroom enhance learning, support effective instruction, and impact the levels of emphasis and ways certain mathematics concepts and skills are learned. For instance, graphing calculators allow students to quickly and easily produce multiple graphs for a set of data, determine appropriate ways to display and interpret the data, and test conjectures about the impact of changes in the data.