PHP Autoloading - Yioop as an IR Library




CS267

Chris Pollett

Mar 8, 2021

Outline

Introduction

Static and Const

Cloning

Inheritance

Referring to Parents, Final

Namespaces

Using Namespaces

Namespace Conventions

Autoloading

More Autoloading

Comments on the Yioop Autoloader

Composer

Quiz

Which of the following is true?

  1. Stemming tends to increase precision, but reduce recall.
  2. Mean Average Precision is a score typically calculated over several topics.
  3. Removing stopwords usually increases recall, but reduces precision.

Composer

Example Composer Project Using Yioop

Comments on the Example

<?php
namespace cpollett\test_composer;

use seekquarry\yioop\library as L;
use seekquarry\yioop\library\Library;
use seekquarry\yioop\library\LinearAlgebra as LA;
use seekquarry\yioop\library\FetchUrl;
use seekquarry\yioop\library\PhraseParser;
use seekquarry\yioop\library\CrawlConstants;

require_once "vendor/autoload.php";

/*
   A normal Yioop instance needs a generated Profile.php file. Calling
   Library::init() sets Yioop up in library mode, so that file is not needed.
   To enable debugging, use Library::init(true);
 */
Library::init();

// download a collection of web pages, then pretty-print the result
$page_info = FetchUrl::getPages(
    [
        [CrawlConstants::URL => "https://www.yahoo.com/"],
        // more URLs to download could be listed here
    ]
);
print_r($page_info);

// stem a single word or a whole phrase
print_r(PhraseParser::stemTerms("image", 'en-US'));
print_r(PhraseParser::stemTerms("I once knew a jumpy cat", 'en-US'));
print_r(PhraseParser::stemTerms("Allons enfants de la Patrie,
    Le jour de gloire est arrivé!", 'fr-FR'));

// segment strings into words
print_r(PhraseParser::segmentSegment("从前,在一块遥远的土地上", 'zh-CN'));

// detect the language of a text
$pinocchio = <<< EOD
Come andò che Maestro Ciliegia, falegname trovò un pezzo di legno che
piangeva e rideva come un bambino.

C'era una volta....

Un re! — diranno subito i miei piccoli lettori.

No, ragazzi, avete sbagliato. C'era una volta un pezzo di legno.

Non era un legno di lusso, ma un semplice pezzo da catasta, di quelli
che d'inverno si mettono nelle stufe e nei caminetti per accendere il
fuoco e per riscaldare le stanze.
EOD;

$lang = L\guessLocaleFromString($pinocchio);
echo $lang . "\n";

// make a term-frequency vector of the stemmed terms
$vec = PhraseParser::extractPhrasesAndCount($pinocchio, $lang);
print_r($vec);

// Normalize this vector
$norm = LA::normalize($vec);
print_r($norm);
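
The last step of the example scales the term-frequency vector. As a rough illustration of what such a normalization could look like, here is a minimal sketch that divides each count by the vector's Euclidean (L2) length. It assumes L2 normalization is what is intended; the function name l2Normalize and the sample counts are hypothetical, it does not use Yioop, and it is not a copy of how LinearAlgebra::normalize is implemented.

```php
<?php
/**
 * Scales a sparse term => weight vector to unit Euclidean (L2) length.
 * Hypothetical helper for illustration only; not part of Yioop.
 */
function l2Normalize(array $vector): array
{
    // Sum the squares of all weights to get the squared length
    $sum_squares = 0.0;
    foreach ($vector as $weight) {
        $sum_squares += $weight * $weight;
    }
    $length = sqrt($sum_squares);
    if ($length == 0.0) {
        return $vector; // an empty or all-zero vector is returned unchanged
    }
    // Divide every weight by the length, keeping the term keys
    $normalized = [];
    foreach ($vector as $term => $weight) {
        $normalized[$term] = $weight / $length;
    }
    return $normalized;
}

// Made-up term counts, not real output from extractPhrasesAndCount
$counts = ["legno" => 3, "pezzo" => 4];
print_r(l2Normalize($counts));
```

With counts 3 and 4 the length is 5, so the two weights become 3/5 and 4/5. Unit-length vectors are convenient because the dot product of two of them is their cosine similarity.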