Unlock the world of neural networks with EasyNN—a user-friendly C++ library that transforms theoretical understanding into practical implementation.
The logistic regression cost function is given as follows:
$\large{J(\theta) = \frac{1}{m} \sum_{i=1}^{m} Cost(h_{\theta}(x^{(i)}), y^{(i)})}$
where
$\large{Cost(h_{\theta}(x), y) = \begin{cases} -\log(h_{\theta}(x)), & \text{if } y =1.\\ -\log (1 - h_{\theta}(x)), & \text{if } y = 0. \end{cases}}$
Since $y \in \{0, 1\}$, the two cases can be combined into a single expression. Hence,
$\large{J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(h_{\theta}(x^{(i)})) + (1 - y^{(i)}) \log (1 - h_{\theta}(x^{(i)})) \right]}$
where
$m$ is the number of samples.
$h_{\theta}(x^{(i)})$ is the logistic regression hypothesis evaluated on the feature vector of the $i^{th}$ sample.
$y^{(i)}$ is the $i^{th}$ measured value.
This final equation is the one that we will use for the implementation.
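As a quick, hand-worked sanity check (not part of the original derivation), consider a single sample with $y = 1$, using the natural logarithm as in the code below:
$\large{Cost(0.9, 1) = -\log(0.9) \approx 0.105 \qquad\qquad Cost(0.1, 1) = -\log(0.1) \approx 2.303}$
A confident correct prediction contributes almost nothing to the cost, while a confident wrong one is penalized heavily; the $y = 0$ case mirrors this behaviour.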
The hypothesis used in the equation above is the logistic regression hypothesis, as already discussed and implemented here. It is repeated here for convenience:
$\large{h_{\theta}(x) = \frac{1}{1 + e^{-\theta^Tx}}}$
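For orientation, a minimal standalone sketch of evaluating this hypothesis could look as follows; the function name and the free-function form are illustrative only and do not necessarily match EasyNN's hypothesis class:

#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative sketch only: h_theta(x) = 1 / (1 + exp(-theta^T x)).
// Assumes x and theta have the same length, with the bias term included as a feature equal to 1.
double logisticHypothesis(const std::vector<double>& x, const std::vector<double>& theta) {
    double thetaTx = 0.0;
    for (std::size_t j = 0; j < x.size(); ++j) {
        thetaTx += theta[j] * x[j];              // theta^T x
    }
    return 1.0 / (1.0 + std::exp(-thetaTx));     // sigmoid
}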
To implement the logistic regression cost function, the evaluate method accepts the features matrix, the measurements vector, and the parameter vector as inputs:
#include <cmath>    // std::log

double CostFuntionLogistic::evaluate(const std::vector<std::vector<double>>& featuresMatrix,
                                     const std::vector<double>& measurementsVector,
                                     const std::vector<double>& parameters) const {
    double costSum = 0.0;
    for (size_t i = 0; i < featuresMatrix.size(); ++i) {
        const auto& x = featuresMatrix[i];       // feature vector of the i-th sample
        double y = measurementsVector[i];        // measured value of the i-th sample
        auto hTheta = hypothesis->evaluate(x, parameters);
        // Per-sample cost: y*log(h) + (1 - y)*log(1 - h)
        auto cost = y * std::log(hTheta) + (1 - y) * std::log(1 - hTheta);
        costSum += cost;
    }
    auto m = measurementsVector.size();
    return -1.0 / m * costSum;
}
To compute the logistic regression cost we run a simple loop, extracting the feature vector from the featuresMatrix and the corresponding measured value from the measurementsVector. The hypothesis value is computed from the feature vector and the parameters. Next, we compute the per-sample cost, accumulate it over the loop, and finally normalize the sum by the number of samples.
Note that from the detailed software design discussion for the linear regression cost function, we already know that the hypothesis is part of the ICostFunction interface; hence it is readily available to the CostFuntionLogistic::evaluate(…) method.
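To make that dependency concrete, a cost-function interface along those lines might look roughly like the sketch below; the member names, the use of a shared_ptr, and the exact signatures are assumptions for illustration, not EasyNN's verbatim declarations:

#include <memory>
#include <vector>

// Hypothetical sketch of the interfaces discussed above.
class IHypothesis {
public:
    virtual ~IHypothesis() = default;
    virtual double evaluate(const std::vector<double>& features,
                            const std::vector<double>& parameters) const = 0;
};

class ICostFunction {
public:
    virtual ~ICostFunction() = default;
    virtual double evaluate(const std::vector<std::vector<double>>& featuresMatrix,
                            const std::vector<double>& measurementsVector,
                            const std::vector<double>& parameters) const = 0;
protected:
    std::shared_ptr<IHypothesis> hypothesis;     // available to derived cost functions
};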
This implementation can be further improved by using standard library algorithms:
#include <cmath>       // std::log
#include <functional>  // std::plus
#include <numeric>     // std::transform_reduce (C++17)

double CostFuntionLogistic::evaluate(const std::vector<std::vector<double>>& featuresMatrix,
                                     const std::vector<double>& measurementsVector,
                                     const std::vector<double>& parameters) const {
    // Per-sample cost, computed from the feature vector x and the measured value y.
    auto cost = [&parameters, this](const std::vector<double>& x, double y) -> double {
        auto hTheta = hypothesis->evaluate(x, parameters);
        return y * std::log(hTheta) + (1 - y) * std::log(1 - hTheta);
    };
    // Sum the per-sample costs over all samples.
    double costSum = std::transform_reduce(std::begin(featuresMatrix), std::end(featuresMatrix),
                                           std::begin(measurementsVector), 0.0,
                                           std::plus<>(),
                                           cost);
    auto m = measurementsVector.size();
    return -1.0 / m * costSum;
}
The cost computation has been moved to a lambda, and the cost summation is exactly the same as for the linear regression cost.
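As a usage illustration with made-up numbers (the construction of CostFuntionLogistic is simplified here and may not match EasyNN's actual setup code), the cost for a toy two-sample dataset could be evaluated like this:

#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<double>> featuresMatrix = {
        {1.0, 2.0},    // sample 0: bias feature plus one input
        {1.0, -1.0}    // sample 1
    };
    std::vector<double> measurementsVector = {1.0, 0.0};
    std::vector<double> parameters = {0.5, 1.0};   // theta

    CostFuntionLogistic costFunction;              // assumed to be wired to a logistic hypothesis
    double j = costFunction.evaluate(featuresMatrix, measurementsVector, parameters);

    // Hand check: h = sigmoid(2.5) ~= 0.924 and sigmoid(-0.5) ~= 0.378,
    // so J ~= -(log(0.924) + log(1 - 0.378)) / 2 ~= 0.277.
    std::cout << "J(theta) = " << j << '\n';
    return 0;
}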
EasyNN currently doesn’t have any explicit tests for logistic regression. However, it is exercised extensively by the gradient descent tests for data classification.