Glossary of a software developer’s language – part three of four
Machine language – is the programming language a computer’s central processing unit (CPU) understands. Unlike high-level programming languages such as Python (introduced below), each instruction in machine language causes the CPU to perform a very specific operation (eg, load, add or jump) on a physical computer component, such as a CPU register or memory. Therefore, machine code (ie, code written in machine language) is regarded as the lowest-level representation of a computer program. Machine code is strictly numerical, consisting of zeros and ones, because that is the form the CPU’s circuitry can execute directly, and even a simple program needs many instructions. Writing machine code is difficult, tedious and error-prone because programmers need to manage individual bits and calculate memory addresses manually to ensure a successfully running program. More readable languages were therefore invented, together with the tools that translate them into machine code: assemblers translate assembly language (a thin, human-readable layer over machine code), compilers translate high-level code, and linkers combine the translated pieces into a runnable program. A phenomenal and renowned computer scientist, Grace Hopper, was an instrumental figure in this transformation as she invented one of the first compiler-related tools and popularised the idea of machine-independent, English-like programming languages, which led to one of the first high-level languages, COBOL.
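To make the idea of numeric instructions concrete, here is a minimal sketch written in Python. It simulates a made-up machine with one register, in which each instruction is an 8-bit number: the top four bits select the operation and the bottom four bits give a memory address or jump target. The opcodes, the `run` function and the encoding are all invented for illustration and do not correspond to any real CPU.

```python
# Toy illustration only: an invented instruction set, not a real CPU's.
LOAD, ADD, JUMP, HALT = 0b0001, 0b0010, 0b0011, 0b1111


def run(program, memory):
    """Execute a list of 8-bit instructions on a one-register machine."""
    register = 0  # the machine's single register
    pc = 0        # program counter: index of the next instruction
    while True:
        instruction = program[pc]
        opcode = instruction >> 4        # top four bits: the operation
        operand = instruction & 0b1111   # bottom four bits: address or target
        pc += 1
        if opcode == LOAD:
            register = memory[operand]   # copy a memory value into the register
        elif opcode == ADD:
            register += memory[operand]  # add a memory value to the register
        elif opcode == JUMP:
            pc = operand                 # continue from another instruction
        elif opcode == HALT:
            return register
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")


program = [
    0b0001_0000,  # LOAD memory[0]
    0b0010_0001,  # ADD memory[1]
    0b0011_0100,  # JUMP to instruction 4
    0b0010_0001,  # ADD memory[1] again (skipped by the jump)
    0b1111_0000,  # HALT
]
memory = [2, 3]
print(run(program, memory))  # prints 5
```

Without the comments, the program is just five numbers, which is why working out by hand what such code does, or where a jump should land, is so error-prone.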
Natural language processing (NLP) – is a field in computer science concerned with how computers can process and analyse human language data to understand languages and translate between them. Research is often traced back to 1950, when Alan Turing proposed the Turing test, which treats the ability to hold a conversation in natural language as a criterion of artificial intelligence. Until the 1980s, most systems relied on rules that humans had identified and written by hand; machine learning algorithms have since sped up the pace of research. With the advent of Google’s Assistant, Amazon’s Alexa and Apple’s Siri, NLP is getting more attention (and personal data), especially for speech recognition and question answering. However, most data used for research consists of everyday conversations, not legal jargon or complex essays. Numerous start-ups have begun to create state-of-the-art NLP models, so it will be exciting to see the progress made in the next few years.
Open source – is a term usually attached to software products whose source code, documentation and blueprints are released under an open source licence, so that the public (but mostly developers) around the world can use them. Popular open source software licences are the Apache Licence, the MIT Licence and the GNU General Public Licence (GPL), which differ in how much freedom they give to those who reuse the code. Owners of open source software may also adopt an open source model for development. By allowing collaboration within the developer community, code can be continuously improved and new ideas can be made into reality much faster. Examples of successful open source models are TEDx, bitcoin and Wikipedia. Nowadays, many technology companies support the open source movement, uploading their research and small non-profit-making products to GitHub and thus increasing their exposure to developers and the general speed of software development.
Python – is one of the most popular high-level programming languages. It can be written in a procedural style, in which the computer is treated as a dummy that does what it is told, step by step. Its readability is one of its selling points as the syntax is like English with simple grammar. Writing Python code comes more easily than writing code in many other programming languages such as C++, and Python is therefore a good first language if you intend to learn coding. Programmers are likely to make fewer syntax errors and can focus on the algorithms and logic (ie, what the computer needs to do step by step to achieve the goals). Besides scripting (code that is written to automate repeated tedious tasks), nowadays Python is used in many machine learning and data analysis tasks. Indeed, proficiency in Python is effectively a prerequisite to using one of the most popular machine learning frameworks, TensorFlow.
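As a small illustration of that readability, the sketch below counts how often each word appears in a sentence. The function name `count_words` and the sample sentence are made up for this example; the point is that the code reads almost like the step-by-step description of the task.

```python
def count_words(text):
    """Return a dictionary mapping each word to how often it appears."""
    counts = {}
    for word in text.lower().split():
        if word not in counts:
            counts[word] = 0  # first time we have seen this word
        counts[word] += 1
    return counts


print(count_words("the cat saw the other cat"))
# prints {'the': 2, 'cat': 2, 'saw': 1, 'other': 1}
```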
Ransomware – is a type of malware that prevents users of affected computers from using the computer and accessing files until they pay a ransom. The malware most commonly encrypts the information so that it cannot be opened and the ransomware’s owner is the only one who can decrypt it. A high-profile incident was WannaCry in 2017, which affected the NHS significantly. On most occasions, a computer is infected with ransomware because its software is out of date and so lacks the security patches that protect against the latest cyberattacks. If your computer is infected with ransomware, the first step is disconnecting the machine from any other devices, including mobile phones and external drives. You can then search for a security utility to remove the ransomware, recover the deleted unencrypted files using file recovery tools or restore the files from a recent system backup. No matter what happens, you should notify the police and a lawyer and pay the ransom only as a last resort.
Scraping – is the process of extracting a large amount of information from websites. Though it can mean simply downloading webpages using built-in tools in a browser, this term usually refers to a web spider crawling through a webpage, extracting data and moving on to other webpages when it meets hyperlinks. Scraping to republish the extracted content is an obvious copyright risk, but scraping itself could also be a breach of a website’s terms of use. Some sophisticated websites, such as Google, can detect spiders and stop them from crawling and scraping. In the United States systematic scraping could even be ‘trespass to chattels’, so beware of what you are scraping.
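The crawl-extract-follow loop of a spider can be sketched in a few lines of Python. To keep the example self-contained and harmless, it does not touch the internet: the `PAGES` dictionary is a hypothetical stand-in for a tiny website, and the names `PageParser` and `crawl` are invented for this illustration. Only Python’s standard `html.parser` module is used.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a website: page address -> HTML content.
PAGES = {
    "/home": '<h1>Home</h1><a href="/about">About</a><a href="/news">News</a>',
    "/about": '<h1>About us</h1><a href="/home">Home</a>',
    "/news": '<h1>Latest news</h1><a href="/about">About</a>',
}


class PageParser(HTMLParser):
    """Collects the <h1> headings and hyperlink targets found on one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.headings = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "h1":
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.headings.append(data.strip())


def crawl(start):
    """Visit pages reachable from `start`, extracting each page's headings."""
    seen = set()
    queue = [start]
    extracted = {}
    while queue:
        address = queue.pop(0)
        if address in seen or address not in PAGES:
            continue  # skip pages already visited or outside the toy site
        seen.add(address)
        parser = PageParser()
        parser.feed(PAGES[address])
        extracted[address] = parser.headings
        queue.extend(parser.links)  # follow the hyperlinks found on this page
    return extracted


print(crawl("/home"))
# prints {'/home': ['Home'], '/about': ['About us'], '/news': ['Latest news']}
```

A real spider would fetch each page over the network instead of reading it from a dictionary, which is exactly the point at which a website’s terms of use and the legal risks described above come into play.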